Comment by zer00eyz
1 year ago
Yes:
An LLM isn't a model of human thinking.
An LLM is an attempt to build a simulation of human communication. An LLM is to language what a forecast is to weather. No amount of weather data is actually going to turn that simulation into snow, and no amount of LLM data is going to create AGI.
That having been said, better models (smaller, more flexible ones) are going to result in a LOT of practical uses that have the potential to make our day-to-day lives easier (think digital personal assistant that has current knowledge).
Great comment. Just one thought: language, unlike weather, is meta-circular. Everything we know about specific words or sentences is itself encoded in words and sentences. So the embedding encodes a subset of human knowledge.
Hence, an LLM is predicting not only language but language that carries some sort of meaning.
That re-embedding is also encoded in weather: the state of the system feeds back into itself. It is why perfect forecasting is impossible, and why we talk about the butterfly effect.
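For a concrete (toy) picture of that effect, here is the Lorenz system itself: two trajectories that start a billionth apart, stepped with plain Euler integration and the classic parameters, end up in completely different places. This is just an illustration I'm adding, not part of the original point.

    # Butterfly effect sketch: two Lorenz trajectories starting 1e-9 apart.
    # Plain Euler steps, classic sigma=10, rho=28, beta=8/3.
    def lorenz_step(state, dt=0.005, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
        x, y, z = state
        return (x + dt * sigma * (y - x),
                y + dt * (x * (rho - z) - y),
                z + dt * (x * y - beta * z))

    a = (1.0, 1.0, 1.0)
    b = (1.0, 1.0, 1.0 + 1e-9)   # perturb one coordinate by a billionth
    for step in range(1, 10001):
        a, b = lorenz_step(a), lorenz_step(b)
        if step % 2000 == 0:
            gap = sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
            print(step, gap)   # the gap grows by orders of magnitude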
The "hallucination problem" is simply the tyranny of Lorenz... one is not sure if a starting state will have a good outcome or swing wildly. Some good weather models are based on re-runing with tweaks to starting params, and then things that end up out of bounds can get tossed. Its harder to know when a result is out of bounds for an LLM, and we dont have the ability to run every request 100 times through various models to get an "average" output yet... However some of the reuse of layers does emulate this to an extent....
Ugh. Really? Those "simulated water isn't wet" (when applied to cognition) "arguments" have been knocked down so many times it hurts even to look at them.
No, simulated water isn't wet.
But an LLM isn't even trying to simulate cognition. It's a model that is predicting language. It has all the problems of a predictive model... the "hallucination" problem is just the tyranny of Lorenz.
We don't really know what "cognition" is, so it's hard to tell whether a system is doing it.
This is plain wrong, due to a mixing of concepts. A language, technically, is something from the Chomsky hierarchy. Predicting a language in that sense means being able to tell whether an input is valid or invalid. LLMs do that, but they also build a statistical model across all valid inputs, and that is not just the language.
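A toy contrast (my example, not the commenter's): a recognizer for the context-free language a^n b^n only answers valid/invalid, while a statistical model over the same valid strings also tells you which of them are likely.

    import re
    from collections import Counter

    def recognize(s):
        # Membership test in the Chomsky-hierarchy sense: is the input valid?
        m = re.fullmatch(r"(a*)(b*)", s)
        return bool(m) and 0 < len(m.group(1)) == len(m.group(2))

    # Made-up corpus of valid strings; the frequencies are the extra
    # information an LLM-style model captures beyond mere validity.
    model = Counter(["ab"] * 70 + ["aabb"] * 25 + ["aaabbb"] * 5)
    total = sum(model.values())

    print(recognize("aabb"), recognize("abb"))        # True False
    print({s: c / total for s, c in model.items()})   # P("ab")=0.7, P("aabb")=0.25, ...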