Comment by mort96
19 hours ago
If I'm reading you right, your opinion is essentially: "If building bigger and bigger statistical next word predictors won't lead to artificial general intelligence, we will never see artificial general intelligence"
I don't know, maybe AGI is possible but there's more to intelligence than statistical next word prediction?
It's not a statistical next word predictor.
'Predicting the next word' is the learning mechanism of the LLM, and it leads to a latent space which can encode higher-level concepts.
Basically, an LLM 'understands' as much as it has to in order to respond in a reasonable way.
An LLM doesn't predict German text or Chinese text. It predicts the concept and then has a language layer that outputs tokens.
And it's not just LLMs that are progressing fast: voice synthesis and voice understanding have jumped significantly, as have motion detection, skeleton movement, virtual world generation (see Nvidia's way of generating virtual worlds for their car training), protein folding, etc.
I'm sorry, but the input to a model is a sequence of tokens and the output is a probability distribution over the next token. It's a very very very fancy next token predictor, but that is fundamentally what it is. I'm making the argument that this paradigm might not give rise to a general intelligence no matter how much you scale it.
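To make the "tokens in, probability distribution out" interface concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the tiny vocabulary, the corpus, and `toy_logits` (a bigram count table standing in for the network) are hypothetical, and a real LLM computes its scores with a trained transformer, not counts. Only the shape of the loop is the point.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary and corpus, purely for illustration.
VOCAB = ["the", "cat", "sat", "on", "mat"]
CORPUS = "the cat sat on the mat".split()


def toy_logits(context):
    """Stand-in for a trained network: one score per vocabulary entry.

    Assumption: scores are smoothed log-counts of which word followed the
    last context token in CORPUS. A real model conditions on the whole context.
    """
    last = context[-1]
    counts = Counter(b for a, b in zip(CORPUS, CORPUS[1:]) if a == last)
    return [math.log(counts[w] + 0.01) for w in VOCAB]


def softmax(logits):
    """Turn raw scores into a probability distribution that sums to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_distribution(context):
    """Input: a sequence of tokens. Output: P(next token) over VOCAB."""
    return dict(zip(VOCAB, softmax(toy_logits(context))))


def generate(context, steps):
    """Greedy decoding: repeatedly append the most probable next token."""
    tokens = list(context)
    for _ in range(steps):
        dist = next_token_distribution(tokens)
        tokens.append(max(dist, key=dist.get))
    return tokens


if __name__ == "__main__":
    print(next_token_distribution(["the"]))
    print(generate(["the"], 3))
```

Whether scaling up the function behind `toy_logits` can ever amount to general intelligence is exactly the point in dispute; the sketch only pins down the input/output contract.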
> It's a very very very fancy next token predictor
Yes, and unless you are prepared to rebut the argument with evidence of the supernatural, that's all there is, period. That's all we are.
So tired of the thought-terminating "stochastic parrot" argument.
> Its not a statistical next word predictor.
It absolutely is a next word predictor.
LLM proponents believe that these higher level encodings in latent space do in fact match the real world concepts described by our language(s).
However, a much simpler explanation for what we see with LLMs is that instead the higher level encodings in latent space match only the patterns of our language(s), and no deeper encoding/understanding is present.
It's Plato's Cave - the shadows on the wall are all an LLM ever sees, and somehow it is expected to derive the real reality behind them.
Could be, yes, for sure, but I think it would be very naive, given the current state of progress, to downplay what is happening.
At least the Mythos model, with its 10 trillion parameters, might indicate that the scaling law holds. It's a little unfortunate that we still don't know much more about that model.