Comment by dnautics

15 days ago

a tombstone "token" doesn't have to be an actual token, nor does it have to be explicitly carved out in the tokenizer. It can be learned. Unless you have looked into the activations of a SOTA LLM, you can't categorically say that one (or 80% of one, for example) doesn't exist.

We CAN categorically say that no such token or cluster of tokens exists, because we know how LLMs and tokenizers work.

Current LLM implementations cannot delete output text, i.e. they cannot remove text from their context window. Autoregressive generation only ever appends: each new output token is added to the context and the model is run again, so there is no backtracking the way humans can do it ("this text was bad, ignore it and remove it from context"). That's part of why we got crazy loops/spirals like we did with the "show me the seahorse emoji" prompts.
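To make the append-only point concrete, here's a minimal sketch of a standard decode loop (the model and token IDs are toy stand-ins, not any real implementation): notice there is no operation anywhere that removes tokens from the context.

```python
# Sketch of a standard autoregressive decode loop with hypothetical stubs.
# The key property: the context list only ever grows.

def decode(model, tokens, max_new=32, eos=0):
    for _ in range(max_new):
        next_tok = model(tokens)   # predict from the full current context
        tokens.append(next_tok)    # append-only; nothing is ever deleted
        if next_tok == eos:
            break
    return tokens

# Toy "model": emits last token + 1, then EOS (0) once it reaches 5.
toy = lambda ctx: 0 if ctx[-1] >= 5 else ctx[-1] + 1
print(decode(toy, [1]))  # [1, 2, 3, 4, 5, 0] — context grows monotonically
```

Whatever a learned "tombstone" direction did inside the activations, this outer loop would still keep every token in the window.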

Backtracking needs more than just a special token or cluster of tokens; the inference loop also has to change its behaviour when it sees that token or token cluster. That behaviour must be manually coded in, it cannot be learned.
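Here's a sketch of what that manual coding would have to look like: the sampler (not the model) watches for a hypothetical BACKTRACK token and truncates the context itself. The token ID and rewind length are made up for illustration.

```python
# Hypothetical backtracking sampler. BACKTRACK and REWIND are invented here;
# no current inference stack does this out of the box.

BACKTRACK = -1   # hypothetical special token id
REWIND = 3       # how many tokens to discard, an arbitrary choice

def decode_with_backtrack(model, tokens, max_new=32, eos=0):
    for _ in range(max_new):
        next_tok = model(tokens)
        if next_tok == BACKTRACK:
            # The manually coded part: the loop deletes context instead of
            # appending. The model alone has no way to perform this edit.
            del tokens[-REWIND:]
            continue
        tokens.append(next_tok)
        if next_tok == eos:
            break
    return tokens

# Toy scripted "model": emits 1, 2, 3, then asks to backtrack, then 9, EOS.
script = iter([1, 2, 3, BACKTRACK, 9, 0])
print(decode_with_backtrack(lambda ctx: next(script), [7]))  # [7, 9, 0]
```

Even if training produced a token the model reliably emits when it wants to retract text, nothing happens unless harness code like the `if next_tok == BACKTRACK` branch exists to act on it.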

  • Without claiming this is actually happening, it is certainly possible to synthetically create a token that ablates the values retrieved by queries in a certain relative time range (via transformations induced by e.g. the RoPE encoding).