Comment by variadix

6 months ago

Using more tokens = more compute available for a given problem. I think most of the benefit of CoT has more to do with autoregressive models being unable to “think ahead” and revise their output, and less to do with actual reasoning. The fact that an LLM can have incorrect reasoning in its CoT and still produce the right answer, or that it can “lie” in its CoT to avoid being detected as cheating on RL tasks, makes me believe the semantic content of CoT is an illusion: the improved performance comes from being able to explore and revise in some internal space, using the extra compute, before producing a final output.
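
To make the “more tokens = more compute” point concrete, here is a back-of-envelope sketch using the standard ≈2 × (parameter count) FLOPs-per-generated-token approximation for a dense transformer. The model size and token counts are made-up illustrations, not figures from the thread, and attention cost is ignored:

```python
# Rough sketch: each generated token costs roughly one forward pass,
# approximated as 2 * n_params FLOPs for a dense transformer.

def generation_flops(n_params: float, n_tokens: int) -> float:
    """Approximate forward-pass FLOPs spent generating n_tokens."""
    return 2 * n_params * n_tokens

P = 70e9  # hypothetical 70B-parameter model

direct = generation_flops(P, 10)       # short direct answer
with_cot = generation_flops(P, 2000)   # long chain of thought first

print(f"direct answer: {direct:.2e} FLOPs")
print(f"with CoT:      {with_cot:.2e} FLOPs ({with_cot / direct:.0f}x more)")
```

Under these assumptions, a 2000-token chain of thought buys about 200x the forward-pass compute of a 10-token direct answer, independent of whether the intermediate text is semantically meaningful.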