Comment by senordevnyc

17 hours ago

I thought the consensus was that models couldn’t actually introspect like this. So there’s no reason to think any of those reasons are actually why the model did what it did, right? Has this changed?

This argument is moot. Humans also can't introspect their own neural wiring well enough to describe the "actual" physical causes of their decisions. Just like LLMs, the best we can do is verbalize a reason (which will naturally contain post-hoc rationalization), which in turn might offer insight that steers future decisions. But unlike LLMs, we have persistent long-term memory that encodes these human-understandable thoughts into opaque new connections in our neural network. At this point the human moat (if you can call it that) is dynamic long-term memory, not intelligence.

  • I think many humans engage in metacognitive reasoning, and that this might not be strongly represented in training data, so it probably isn't common in LLMs yet. They can still do it when prompted, though.

    • LLMs have zero metacognition. Don't be fooled: their output is stochastic inference, and they have no self-awareness. The best you'll see is an improvised post-hoc rationalization story.