Comment by jstanley
14 hours ago
How can you use these models for any length of time and walk away with the understanding that they do not think or reason?
What even is thinking and reasoning if these models aren't doing it?
They produce wonderful results, they are incredibly powerful, but they do not think or reason.
Among many other factors, perhaps the key differentiator that prevents me from describing these models as thinking is proactivity.
LLMs are never proactive.
(No, prompting them in a loop is not proactivity.)
Human brains are so proactive that given zero stimuli they will hallucinate.
As for reasoning, they simply do not. They do a wonderful facsimile of reasoning, one that's especially useful for producing computer code. But they do not reason, and it is a mistake to treat them as if they can.
I personally don't agree that proactivity is a prerequisite for thinking.
But what would proactivity in an LLM look like, if prompting in a loop doesn't count?
An LLM experiences reality in terms of the flow of the token stream. Each iteration adds one more token to the input context, and the LLM has a quantum of experience while computing the output distribution for the new context.
A human experiences reality in terms of the flow of time.
We are not able to be proactive outside the flow of time, because it takes time for our brains to operate, and similarly LLMs are not able to be proactive outside the flow of tokens, because it takes tokens for the neural networks to operate.
The flow of time is so fundamental to how we work that we would not even have any way to be aware of any goings-on that happen "between" time steps even if there were any. The only reason LLMs know that there is anything going on in the time between tokens is because they're trained on text which says so.
Also, an LLM will quite happily hallucinate on zero input if you keep sampling it and feeding the generated tokens back in.
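To make that feedback loop concrete, here is a minimal toy sketch of autoregressive sampling from an empty prompt. This is not a real LLM: the hard-coded bigram table (purely made-up data) stands in for the model's next-token distribution, but the loop itself is the same shape — each iteration runs the "model" on the context and appends exactly one token, with no user input at all.

```python
import random

# Toy stand-in for a language model: each token maps to a list of
# possible next tokens. (Hypothetical data, for illustration only --
# a real model computes this distribution with a neural network.)
BIGRAMS = {
    "<s>":  ["the", "a", "once"],
    "the":  ["cat", "dog", "end"],
    "a":    ["cat", "dog"],
    "once": ["the"],
    "cat":  ["sat", "ran"],
    "dog":  ["ran", "sat"],
    "sat":  ["the", "<eos>"],
    "ran":  ["the", "<eos>"],
    "end":  ["<eos>"],
}

def sample_from_nothing(max_tokens=10, seed=0):
    """Autoregressive sampling starting from zero input.

    The context begins with only a start marker; every loop iteration
    samples one token from the 'model' and feeds it back into the
    context before the next step.
    """
    rng = random.Random(seed)
    context = ["<s>"]  # no prompt: just the start-of-sequence marker
    while len(context) < max_tokens:
        nxt = rng.choice(BIGRAMS[context[-1]])
        if nxt == "<eos>":
            break
        context.append(nxt)
    return context[1:]  # generated tokens, start marker dropped
```

Run `sample_from_nothing()` and it produces a stream of tokens with no external stimulus — the output is conditioned only on its own previous output, which is the point being made above.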
Thinking and reasoning cannot be abstracted away from the individual who experiences the thinking and reasoning itself and changes because of it.
LLMs are amazing, but they represent a very narrow slice of what thinking is. Living beings are extremely dynamic and both much more complex and simple at the same time.
There is a reason for:
- companies releasing new versions every couple of months
- LLMs needing massive amounts of data to train on that is produced by us and not by itself interacting with the world
- a massive amount of manual labor being required both for data labeling and for reinforcement learning
- LLMs not being able to guide you through a solution on their own, but ultimately needing guidance at every decision point