Comment by benrutter
8 hours ago
Not OP, but I think the argument here would be not that LLMs "are not smart" but that smart is just the wrong category of thing to describe an LLM as.
A calculator can do very complex sums very quickly, but we don't tend to call it "smart" because we don't think it's operating intelligently according to some internal model of the world. I think the "LLMs are AGI" crowd would say that LLMs are doing exactly that, but it's perfectly consistent to find the output of LLMs consistent/impressive/useful while still maintaining that they aren't "smart" in any meaningful way.
> "we don't think it's operating intelligently to some internal model of the world"
Okay, but you have to actually address why you think LLMs lack an "internal model of the world".
You can train one on 1930s text, and then teach it Python in-context.
They've produced multiple novel mathematical proofs now; Terence Tao is impressed with them as research assistants.
You can very clearly ask them questions about the world, and they'll produce answers that match what you'd get from a "model" of the world.
What are weights, if not a model of the world? It's got a very skewed perspective, certainly, since it's terminally online and has never touched grass, but it still very clearly has a model of the world.
I'd dare say it's probably a more accurate model than the average person has, too, thanks to having Wikipedia and such baked in.
Intelligence can be defined as an optimization problem: "find X which maximizes F(X, Y)", where X is the solution, Y is the constraints, and F is the optimality/fitness criterion. Most other definitions are inane. E.g. "invent an aircraft" can be described as optimization over possible build instructions, under given constraints on base materials, with fitness being the ability to fly. Absolutely any invention can be formulated as an optimization problem.
It's not like a calculator, because an LLM can solve very broad classes of problems - you'd struggle to define a problem an LLM can't solve (given some fine-tuning, a harness, a KB, etc.).
All this talk about "smartness" isn't even particularly cute...
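The "find X which maximizes F(X, Y)" framing above can be sketched in a few lines. This is a toy illustration only: the fitness function and the random-search strategy are stand-ins I've chosen, not anything from the comment, and real optimization would use far better search methods.

```python
import random

def F(x, y):
    """Toy fitness criterion: score solution x under constraints y.
    Here y is a (lo, hi) feasible range, and the 'best design'
    is the one closest to 3.0 (an arbitrary stand-in target)."""
    lo, hi = y
    if not (lo <= x <= hi):
        return float("-inf")  # infeasible solutions score worst
    return -(x - 3.0) ** 2

def solve(fitness, y, iters=10_000, seed=0):
    """'Intelligence' as optimization: search for X maximizing F(X, Y)."""
    rng = random.Random(seed)
    lo, hi = y
    best_x, best_score = None, float("-inf")
    for _ in range(iters):
        x = rng.uniform(lo, hi)  # propose a candidate solution
        score = fitness(x, y)
        if score > best_score:
            best_x, best_score = x, score
    return best_x

x = solve(F, (0.0, 10.0))  # converges near the optimum at 3.0
```

Under this framing, swapping in "possible build instructions" for X and "ability to fly" for F turns the same loop into the aircraft-invention example; only the search space and fitness criterion change.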
> It's not like a calculator because LLM can solve very broad classes of problems
So can computer programs. Are computer programs intelligent?
A specific program solves only a specific, narrow class of problems.
If you make a program which can solve many different classes of problems, that's called AI.
I would analogize LLMs to physics simulations in software. Game engines, for example, simulate physics well enough to provide a semblance of real-world physics sufficient for suspension of disbelief, but we would never mistake them for real-world physics. Sufficiently complicated simulations, e.g. for weather forecasting, nuclear weapons, or QCD, can provide insights and test physics theories, but again, experts would never mistake them for real-world physics, and can explain where the simulation breaks down when it tries to predict real-world behavior.
Now we have these LLMs that provide some simulation of reasoning merely through prediction of token patterns, and that is indeed unexpected and astonishing. However, the AI promoters want to suggest that this simulation of reasoning is human-level reasoning, or is evolving toward human-level reasoning, and that is the same mistake as taking game-engine physics for real physics. The failure cases (e.g. the "walk vs. drive to a car wash next door" question, or the trouble generating an image of a completely full glass of wine), even if patched away, are enough to reveal the token predictor underneath.