Comment by AntiUSAbah
1 day ago
It's not a statistical next-word predictor.
'Predicting the next word' is the learning mechanism of the LLM, and optimizing for it gives rise to a latent space that can encode higher-level concepts.
Basically, an LLM 'understands' as much as it needs to in order to respond in a reasonable way.
An LLM doesn't predict German text or Chinese text. It predicts the concept and then has a language layer outputting tokens.
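To make that picture concrete, here's a toy numpy sketch (the dimensions and weights are entirely made up): the context is summarized as a hidden state in latent space, and only a final "unembedding" layer maps that vector into a distribution over tokens of whatever language is being produced.

```python
import numpy as np

# Toy sketch with made-up dimensions: the context is compressed into a latent
# hidden state, and a separate linear "language layer" (unembedding) maps that
# latent vector to a probability distribution over output tokens.
d_model, vocab_size = 8, 10
rng = np.random.default_rng(0)

hidden_state = rng.normal(size=d_model)                 # latent representation of the context
unembedding = rng.normal(size=(d_model, vocab_size))    # the "language layer"

logits = hidden_state @ unembedding                     # scores for each candidate token
probs = np.exp(logits - logits.max())
probs /= probs.sum()                                    # softmax -> next-token distribution
print(probs.round(3))
```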
And it's not just LLMs that are progressing fast: voice synthesis and voice understanding have jumped significantly, along with motion detection, skeleton tracking, virtual-world generation (see Nvidia's approach to generating virtual worlds for training their self-driving cars), protein folding, etc.
I'm sorry, but the input to a model is a sequence of tokens and the output is a probability distribution over the next token. It's a very, very, very fancy next-token predictor, but that is fundamentally what it is. I'm making the argument that this paradigm might not give rise to a general intelligence no matter how much you scale it.
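Concretely, the whole inference loop looks roughly like this (a minimal sketch using the public gpt2 checkpoint via Hugging Face transformers; the prompt and number of steps are arbitrary choices for illustration):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the `transformers` package and the public gpt2 checkpoint are available.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids
for _ in range(5):
    with torch.no_grad():
        logits = model(input_ids).logits                       # (batch, seq, vocab)
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)    # distribution over the next token
    next_id = torch.argmax(next_token_probs)                   # greedy pick from that distribution
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
```

Tokens in, a distribution over the next token out, repeated until done; everything else is in how that distribution gets computed.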
> It's a very, very, very fancy next-token predictor
Yes, and unless you are prepared to rebut the argument with evidence of the supernatural, that's all there is, period. That's all we are.
So tired of the thought-terminating "stochastic parrot" argument.
Do LLMs even learn? The companies that build them build new models based partly on the conversations the older models have had with people, but do they incorporate knowledge into their neural nets as they go along?
Can an LLM decide, without prompting or API calls, to text someone or go read about something or do anything at all except wait for the next prompt?
Do LLMs have any conceptual understanding of anything they output? Do they even have a mechanism for conceptual understanding?
LLMs are incredibly useful and I'm having a lot of fun working with them, but they are a long way from some kind of general intelligence, at least as far as I understand it.
I'm not sure why you think you know that the human brain works by predicting the next token.
It's not supernatural. I believe that an artificial intelligence is possible because I believe human intelligence is just a clever arrangement of matter performing computation, but I would never be presumptuous enough to claim to know exactly how that mechanism works.
My opinion is that human intelligence might essentially be a fancy next-token predictor, or it might work in some completely different way; I don't know. Your claim is that human intelligence is a next-token predictor. It seems like the burden of proof is on you.
> It's not a statistical next-word predictor.

It absolutely is a next-word predictor.
LLM proponents believe that these higher-level encodings in latent space do in fact match the real-world concepts described by our language(s).
However, a much simpler explanation for what we see with LLMs is that the higher-level encodings in latent space instead match only the patterns of our language(s), with no deeper encoding or understanding present.
It's Plato's Cave: the shadows on the wall are all an LLM ever sees, and somehow it is expected to derive the reality behind them.
Could be, yes, for sure. But given the current state of progress, I think it would be very naive to downplay what is happening.
At least the Mythos model, with its 10 trillion parameters, might indicate that the scaling law still holds. It's a bit unfortunate that we still don't know much more about that model.
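For reference, "the scaling law" here usually means the empirical power-law relationship between parameter count and loss. A toy sketch of that shape (the constants below are placeholders for illustration, not fitted values for any real model):

```python
# Illustrative only: a power-law scaling curve of the kind reported for language
# models, loss(N) ~ (N_c / N) ** alpha. The constants are placeholders, not
# fitted values for Mythos or any other real model.
def predicted_loss(n_params, n_c=1e14, alpha=0.08):
    return (n_c / n_params) ** alpha

for n in (1e9, 1e11, 1e13):  # 1B, 100B, 10T parameters
    print(f"{n:.0e} params -> predicted loss {predicted_loss(n):.3f}")
```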