Comment by spiorf
3 months ago
We know how the next token is selected, but not why doing that repeatedly brings all the capabilities it does. We really don't understand how the emergent behaviours emerge.
It feels less like a word prediction algorithm and more like a world model compression algorithm. Maybe we tried to create one and accidentally created the other?
It's almost like a Model of Language, but very Large
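For what it's worth, the "how the next token is selected" part really is just a loop like the toy sketch below (the `model` callable and `generate` helper are made-up names for illustration, not any real library): score every vocabulary token, turn the scores into a distribution with softmax, sample one token, append it, repeat. Whatever emergent behaviour there is has to fall out of iterating exactly this.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = (logits - logits.max()) / temperature
    p = np.exp(z)
    return p / p.sum()

def generate(model, context, n_tokens, temperature=1.0, seed=0):
    """Autoregressive decoding: score every token, sample one, append, repeat.
    `model(context)` stands in for a real forward pass returning one logit
    per vocabulary entry."""
    rng = np.random.default_rng(seed)
    for _ in range(n_tokens):
        logits = np.asarray(model(context))          # shape: (vocab_size,)
        probs = softmax(logits, temperature)
        next_token = int(rng.choice(len(probs), p=probs))
        context = context + [next_token]
    return context

# Toy usage with a fake 5-token vocabulary "model":
toy_model = lambda ctx: np.array([0.1, 2.0, 0.3, -1.0, 0.5])
print(generate(toy_model, [0], n_tokens=3))
```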
Why would asking a question about ice cream trigger a consideration of all possible topics? As in, to formulate the answer, the LLM will even consider the origin of Elephants. It won't be significant, but it will be factored in.
Why? In the spiritual realm, many have postulated that even the Elephant you never met is part of your life.
None of this is a coincidence.
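Setting the spiritual reading aside, there is a mundane sense in which everything is "factored in": the softmax over the vocabulary assigns a strictly nonzero probability to every token, including "elephant" after an ice-cream prompt; it's just vanishingly small. A toy illustration with completely made-up logits and a four-word vocabulary:

```python
import numpy as np

# Made-up logits for a tiny illustrative vocabulary; a real vocab has tens of thousands of entries.
vocab  = ["vanilla", "chocolate", "cone", "elephant"]
logits = np.array([4.0, 3.5, 2.0, -6.0])   # hypothetical scores after "My favourite ice cream is ..."

probs = np.exp(logits - logits.max())
probs /= probs.sum()

for tok, p in zip(vocab, probs):
    print(f"{tok:10s} {p:.6f}")
# "elephant" gets a tiny but nonzero probability: factored in, never significant.
```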
Eh, I feel like that's mostly just down to this: yes, transformers are a "next token predictor", but during instruct fine-tuning the attention-related wagon slapped on the back is partially hijacked as a bridge from input tokens to sequences of connections in the weights.
For example, if I ask "If I have two foxes and I take away one, how many foxes do I have?", I reckon attention has been hijacked to essentially highlight the "if I have x and take away y then z" portion of the query and connect it to a learned sequence from readily available training data (apparently the whole damn Internet), where there are plenty of examples of said math-question trope, just using some other object type than foxes.
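A toy picture of that "highlighting", using scaled dot-product attention with entirely made-up vectors (the token list, dimensions and the split into "pattern" vs "object" tokens are all assumptions for illustration): keys for the arithmetic-template words are constructed to align with the query direction, so they soak up most of the attention weight, while "foxes" gets only a small share.

```python
import numpy as np

def attention_weights(query, keys):
    """Scaled dot-product attention weights for a single query over a set of keys."""
    scores = keys @ query / np.sqrt(query.shape[0])
    scores -= scores.max()
    w = np.exp(scores)
    return w / w.sum()

d = 8
rng = np.random.default_rng(42)

# Made-up embeddings: "pattern" tokens share a direction with the query; "foxes" mostly doesn't.
pattern_dir = rng.normal(size=d)
pattern_dir /= np.linalg.norm(pattern_dir)

tokens = ["If", "I", "have", "two", "foxes", "take", "away", "one", "how", "many"]
keys = np.stack([
    pattern_dir * 2.0 + rng.normal(scale=0.3, size=d)     # arithmetic-template tokens
    if t in {"have", "take", "away", "how", "many"}
    else rng.normal(scale=0.5, size=d)                     # everything else, incl. "foxes"
    for t in tokens
])
query = pattern_dir * 2.0 + rng.normal(scale=0.3, size=d)  # final-position query

for t, w in zip(tokens, attention_weights(query, keys)):
    print(f"{t:6s} {w:.3f}")
# The template tokens dominate; "foxes" keeps a small weight, enough to copy the
# object type into the answer but not to drive the computation.
```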
I think we could probably prove it by tracing the hyperdimensional space the model lives in: ask it variants of the same question and look for hotspots in that space which would indicate it's using those same sequences (with attention branching off to ensure it replies with the correct object type that was referenced).
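A rough sketch of that probe, assuming a HuggingFace-style API; the model name ("gpt2"), the layer choice, and mean-pooling over the sequence are all placeholder assumptions, not the one right way to do it. Run paraphrases that differ only in the object type, pool a hidden layer into a single vector per prompt, and check whether the variants cluster more tightly with each other than with an unrelated control question.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "gpt2"   # placeholder; any causal LM that exposes hidden states works

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

prompts = [
    "If I have two foxes and I take away one, how many foxes do I have?",
    "If I have two apples and I take away one, how many apples do I have?",
    "If I have two chairs and I take away one, how many chairs do I have?",
    "What colour is the sky on a clear day?",   # control: a different trope entirely
]

def pooled_state(text, layer=-1):
    """Mean-pool one hidden layer over the sequence as a crude location in activation space."""
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[layer].mean(dim=1).squeeze(0)

vecs = [pooled_state(p) for p in prompts]
cos = torch.nn.functional.cosine_similarity

for i in range(len(prompts)):
    for j in range(i + 1, len(prompts)):
        print(f"{i} vs {j}: {cos(vecs[i], vecs[j], dim=0):.3f}")
# If the "take away" variants sit in a tight hotspot while the control sits apart,
# that's (weak) evidence the same learned sequence is being reused across object types.
```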