
Comment by tibbar

6 days ago

I think this statement is on the same level as "a human cannot explain why they gave the answer they gave because they cannot actually introspect the chemical reactions in their brain." That is true, but a human often has an internal train of thought that preceded their ultimate answer, and it is interesting to know what that train of thought was.

In the same way, it is often quite instructive to know what the reasoning trace was that preceded an LLM's answer, without having to worry about what, mechanically, the LLM "understood" about the tokens, if this is even a meaningful question.

But it's not a reasoning trace. Models could produce one if they were designed to (an actual stack of the calls and the states of the tensors at each call, probably with a helpful lookup table for the tokens), but they specifically haven't been made to do that.
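For contrast, a trace of the kind described above would record every call together with its numeric state. A toy sketch in plain Python, with lists standing in for tensors and all names purely illustrative:

```python
# Toy illustration of a *mechanistic* trace: log each layer call and its
# numeric state, rather than text the model writes about itself.
# Lists stand in for tensors; layer names and weights are made up.

def make_traced_layer(name, weight):
    def layer(x, trace):
        y = [weight * v for v in x]  # stand-in for a real matmul
        trace.append({"call": name, "input": list(x), "output": list(y)})
        return y
    return layer

def forward(x):
    trace = []
    h = make_traced_layer("layer0", 2.0)(x, trace)
    h = make_traced_layer("layer1", 0.5)(h, trace)
    return h, trace

out, trace = forward([1.0, 3.0])
# trace now holds the full stack of calls with their inputs and outputs
```

Doing this for a real model would mean hooking every layer and storing enormous tensors per token, which is presumably why production systems don't ship it.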

  • When you put an LLM in reasoning mode, it will approximately have a conversation with itself. This mimics an inner monologue.

    That conversation is held in text, not in any internal representation. That text is called the reasoning trace. You can then analyse that trace.
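    Since the trace is plain text, analysing it can be as simple as splitting it off from the answer. A minimal sketch, assuming a DeepSeek-R1-style `<think>` delimiter (the tag convention varies by model and is an assumption here):

    ```python
    import re

    def extract_reasoning_trace(output: str) -> tuple[str, str]:
        """Split raw model output into (reasoning trace, final answer).

        Assumes the monologue is wrapped in <think> tags, as some open
        reasoning models do; the tag name is not a universal standard.
        """
        match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
        if match is None:
            return "", output.strip()
        trace = match.group(1).strip()
        answer = output[match.end():].strip()
        return trace, answer

    raw = "<think>2 apples plus 3 apples is 5 apples.</think>The answer is 5."
    trace, answer = extract_reasoning_trace(raw)
    ```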

    • Unless things have changed drastically in the last 4 months (the last time I looked at it), those traces are not stored but reconstructed when asked. Which is still the same problem.
