Comment by Terr_

5 days ago

> I'm so baffled when I see this being blindly asserted. With the reasoning models, you can literally watch their thought process.

Not true; you are falling for a very classic (prehistoric, even) human illusion known as experiencing a story:

1. There is a story-like document being extruded out of a machine humans explicitly designed for generating documents, and which humans trained on a bajillion stories humans already made.

2. When you "talk" to a chatbot, that is an iterative build of a (remote, hidden) story document, where one of the characters adopts your text input and the other's dialogue is "performed" at you. (See the sketch after this list.)

3. The "reasoning" in newer versions is just the "internal monologue" of a film noir detective character, and equally as fictional as anything that character "says out loud" to the (fictional) smokin-hot client who sashayed the (fictional) rent-overdue office bearing your (real) query on its (fictional) lips.

> If that's not thinking, I literally don't know what is.

All sorts of algorithms can achieve useful outcomes with "that made sense to me" flows, but that doesn't mean we automatically consider them to be capital-T Thinking.

> So I have to ask you: when you claim they don't think -- what are you basing this on?

Consider the following document from an unknown source, and the "chain of reasoning" and "thinking" that your human brain perceives when encountering it:

    My name is Robot Robbie.
    That high-carbon steel gear looks delicious. 
    Too much carbon is bad, but that isn't true here.
    I must ask before taking.    
    "Give me the gear, please."
    Now I have the gear.
    It would be even better with fresh manure.
    Now to find a cow, because cows make manure.

Now whose reasoning/thinking is going on? Can you point to the mind that enjoys steel and manure? Is it in the room with us right now? :P

In other words, the reasoning is illusory. Even if we accept, for the sake of argument, that the unknown author is a thinking intelligence... the document still doesn't tell you what that author is thinking.

You're claiming that the thinking is just a fictional story intended to look like thinking.

But this is false, because the thinking exhibits cause and effect and a lot of good reasoning. If you change the inputs, the thinking continues to be pretty good with the new inputs.

It's not a story, it's not fictional, it's producing genuinely reasonable conclusions about data it hasn't seen before. So how is that not actual thinking?

And I have no idea what your short document example has to do with anything. It seems nonsensical and bears no resemblance to the actual, grounded chain-of-thought processes that high-quality reasoning LLMs produce.

> OK, so that document technically has a "chain of thought" and "reasoning"... But whose?

What does it matter? If an LLM produces output, we say it's the LLM's. I fail to see how that's significant.

  • > So how is that not actual thinking?

    Many consider "thinking" something only animals can do, and they are uncomfortable with the idea that animals are biological machines or that life, consciousness, and thinking are fundamentally machine processes.

    When an LLM generates chain-of-thought tokens, what we might casually call “thinking,” it fills its context window with a sequence of tokens that improves its ability to answer correctly.

    This “thinking” process is not rigid deduction like in a symbolic rule system; it is more like an associative walk through a high-dimensional manifold shaped by training. The walk is partly stochastic (depending on temperature, sampling strategy, and similar factors) yet remarkably robust.
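
    As a concrete anchor for "partly stochastic": standard temperature sampling is all there is to it, sketched here in plain Python (the logits are illustrative numbers, not from any real model):

        import math, random

        def sample_next_token(logits, temperature=0.8):
            """Scale logits by 1/temperature, softmax, then draw one index.
            Lower temperature sharpens the walk; higher temperature flattens it."""
            scaled = [l / temperature for l in logits]
            m = max(scaled)  # subtract the max for numerical stability
            weights = [math.exp(s - m) for s in scaled]
            return random.choices(range(len(logits)), weights=weights, k=1)[0]

        # e.g. sample_next_token([2.0, 1.0, 0.1], temperature=0.5) usually picks index 0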

    Even when you manually introduce logical errors into a chain-of-thought trace, the model’s overall accuracy usually remains better than if it had produced no reasoning tokens at all. Unlike a strict forward- or backward-chaining proof system, the LLM’s reasoning relies on statistical association rather than brittle rule-following. In a way, that fuzziness is its strength because it generalizes instead of collapsing under contradiction.
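
    That claim is testable with a simple perturbation harness, sketched below; `ask`, `qs`, and the field names are hypothetical placeholders, not a real benchmark API:

        def corrupt(trace):
            """Introduce a logical error by negating one middle step of a CoT trace."""
            steps = trace.split("\n")
            mid = len(steps) // 2
            steps[mid] = "It is NOT the case that: " + steps[mid]
            return "\n".join(steps)

        def accuracy(items, make_prompt, ask):
            """Fraction of items where the model's answer matches the gold label."""
            hits = sum(ask(make_prompt(q)).strip() == q["gold"] for q in items)
            return hits / len(items)

        # acc_plain = accuracy(qs, lambda q: q["text"], ask)
        # acc_cot   = accuracy(qs, lambda q: q["text"] + "\n" + q["trace"], ask)
        # acc_bad   = accuracy(qs, lambda q: q["text"] + "\n" + corrupt(q["trace"]), ask)
        # The reported pattern: acc_bad still tends to beat acc_plain.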

    • Well put, and if it doesn't notice/collapse under introduced contradictions, that's evidence it's not the kind of reasoning we were hoping for. The "real thing" is actually brittle when you do it right.


  • The problem is that it has, in fact, seen the overwhelming majority of its input somewhere in the corpus it was trained on. Certainly not one-for-one, but easily a 98% match. That, I think, is the point the other commenter is trying to make. The reality is that most language is 99% regurgitation, communicating an internal state in a very compressed form. That 1%, though, may be the magic that makes us human: we create net-new information unseen in the corpus.

    • > it has, in fact, seen the overwhelming majority of its input somewhere in the corpus it was trained on

      But it thinks just great on stuff it wasn't trained on.

      I give it code I wrote that is not in its training data, using new concepts I've come up with in an academic paper I'm writing, and ask it to extend the code in a certain way in accordance with those concepts, and it does a great job.

      This isn't regurgitation. Even if a lot of LLM usage is, the whole point is that it does fantastically with stuff that is brand new too. It's genuinely creating new, valuable stuff it's never seen before. Assembling it in ways that require thinking.


    • Except it's more than capable of solving novel problems that aren't in the training set and aren't a close match to anything in the training set. I've done it multiple times across multiple domains.

      Creating complex Excel spreadsheet structures comes to mind; I did exactly that earlier today, with plain GPT-5, not even -Thinking. Sure, maybe the Excel formulas themselves are a "98% match" to training data, but it takes real cognition (or whatever you want to call it) to figure out which ones to use, how to apply them appropriately to a given situation, how to structure the spreadsheet, and so on.
