Comment by rufo

1 hour ago

The reasoning tokens are really just there to extend how much the LLM can "compute" on a problem. Put another way, the only way a given model can "think" more about a problem is to fill more of its context with predicted tokens, which gives each later token more intermediate work to condition on and so tends to make it more accurate. The reinforcement learning these models go through generally doesn't care what the chain-of-thought tokens look like (outside of preventing loops/gibberish/reward hacking), only how good the final answer is. So while the trace does look something like "reasoning" to us and correlates roughly with the final answer, treating it as a faithful preview of that answer, or as an actual thought process, is giving those tokens too much credit :)
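
To make the "only the final answer is scored" point concrete, here's a toy sketch of an outcome-only reward. It's purely illustrative and not any lab's actual training code: the `outcome_reward` name, the `ANSWER:` marker, and exact-match scoring are all assumptions made up for the example.

```python
def outcome_reward(completion: str, expected: str, marker: str = "ANSWER:") -> float:
    """Toy outcome-only reward: score the final answer, ignore the reasoning text.

    Hypothetical helper for illustration; the marker and exact-match rule are assumptions.
    """
    # Everything before the last marker is treated as chain of thought
    # and is never inspected.
    _, found, answer = completion.rpartition(marker)
    if not found:
        return 0.0  # no final answer was emitted
    return 1.0 if answer.strip() == expected.strip() else 0.0


# A tidy trace and a jumbled one that end in the same answer.
tidy = "2 + 2 is 4, times 3 is 12. ANSWER: 12"
messy = "The answer is 12. Let me derive it... 7? Verified! ANSWER: 12"
wrong = "2 + 2 is 4, times 3 is 12. ANSWER: 13"

rewards = [outcome_reward(c, "12") for c in (tidy, messy, wrong)]
```

Under this kind of scoring the tidy and the jumbled trace earn the same reward, and a clean-looking trace with a wrong answer earns nothing, so nothing pushes the trace itself to be a faithful account of how the answer was reached.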

For me, what really drove this point home (that reasoning traces aren't "real" by any reasonable definition of the term) was noticing traces with steps out of order and various inconsistencies with the final answer. My favorite was an example posted to HN that went something like this: the model first output the conclusion, then performed the supposed derivation after the fact, then stated it needed to verify the earlier conclusion to confirm the derivation was correct, so it hallucinated a tool call, then it remarked approvingly that the verification matched, and finally it output a slightly different answer. At no point was the answer actually correct, although it was vaguely in the ballpark.