Comment by tovej
2 months ago
Because they don't. The chain-of-reasoning feature is really just a way to get the LLM to prompt itself more.
The fact that it generates these "thinking" steps does not mean it is using them for reasoning. Its most useful effect is making it seem to a human that there is a reasoning process.
I love how generating strings like "let me check my notes" ends up producing somewhat better end results - it steers the output distribution towards text that appears to be written by someone who did check their notes :D
I can't remember which lecture it was, but a guy said "they don't think, they only seem to think, and they won't replace a substantial portion of human labor, they will only seem to do so" ;)
Joking aside, this is exactly what happens when companies announce "AI" replacing human labor: what they are actually doing is correcting for COVID-era overhiring while trying to frame it in a way that won't make the stock go too red.
Is this position axiomatic or falsifiable? What would it take to change your mind?
It doesn't have to be either because the burden of proof is not on me. It's on whoever claims that chaining multiple prompts together produces thinking, even though a single prompt is just predicting n-grams.
The chain does not change the token generation process, it just artificially lengthens it.
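To make that concrete, here is a minimal sketch, assuming GPT-2 via Hugging Face transformers as a stand-in model and plain greedy decoding (not any particular vendor's setup): every token, "thinking" or answer, falls out of the same next-token loop, and a chain-of-thought style prompt just makes that loop run longer.

```python
# Minimal sketch, assuming GPT-2 as a stand-in "LLM". Greedy decoding:
# every token, "thinking" or answer, comes out of the same next-token step;
# the chain only lengthens the loop.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

@torch.no_grad()
def greedy_generate(prompt: str, max_new_tokens: int = 40) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        logits = model(ids).logits[:, -1, :]           # same computation every step
        next_id = logits.argmax(dim=-1, keepdim=True)  # pick the most likely token
        ids = torch.cat([ids, next_id], dim=-1)
    return tok.decode(ids[0])

# Same mechanism whether the prompt invites "thinking" or not;
# the second call just spends more tokens before reaching an answer.
print(greedy_generate("Q: 17 + 25 = ?\nA:"))
print(greedy_generate("Q: 17 + 25 = ?\nLet's think step by step.\nA:"))
```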
How would you determine humans have reasoning then, in a way that LLMs do not?
Easy, humans can synthesize new facts using logic and context. And the conclusions can be checked against the real world to verify that the reasoning was correct.
Or another way: reasoning is a socially constructed concept, developed by humans. Humans therefore have defined reasoning, and must therefore know how to reason.
Or a third way: I experience reasoning, you experience reasoning. I am currently reasoning. You are currently reasoning. I am human, as are you. Therefore humans reason.
Or — here's a fun one — subjective experience.
This one is even easier. LLMs record objective data about n-gram distributions; there is no room for any "subjective state" in their working set.
Or another way: an LLM will respond the same way whether you wait one second or a decade between prompts. The only way to interact with it is through a stream of tokens. There is exactly one stream at all times, and each time the stream is fed in again, it is barely different from the previous input. The LLM's generation process does not change depending on the contents of the stream. It may produce the exact same token for two different streams. It may also encounter the same stream twice, and it will act the same in both cases. If it were a "subject" "experiencing", say, a discussion, it would use its past "experience". But it does not.
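To put that last point in code: treat the model as a pure function from a token stream to the next token. Below is a toy lookup table standing in for a real model, purely for illustration; the point is that the stream is the only input, so there is nowhere for a "subject" to accumulate experience between calls.

```python
# Toy illustration (not a real model): the "LLM" is a pure function from a
# token stream to the next token. The stream is the only input; there is no
# hidden state that carries "experience" from one call to the next.
from typing import Dict, Tuple

Token = str
Stream = Tuple[Token, ...]

# Fixed "weights": a frozen mapping, never updated at inference time.
TABLE: Dict[Stream, Token] = {
    ("Hello", ","): "world",
}

def next_token(stream: Stream) -> Token:
    return TABLE.get(stream, "<unk>")

s: Stream = ("Hello", ",")
print(next_token(s))  # "world"
# One second later or a decade later, the same stream gets the same answer:
print(next_token(s))  # "world" again; nothing "remembered" changes the result
```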
They haven't replied to my comment but have replied to yours, so I can only assume they actually cannot point out the difference. That makes sense: the philosophy of mind is a very old subject, and there is no way a threaded conversation like this would produce any concrete answers.