Comment by wzdd

5 hours ago

> The mechanism that the model uses to transition towards the answer is to generate intermediate text.

Yes, which makes sense. If there's a landscape of states the model is traversing, and there are probabilistically likely pathways between an initial state and the desired output but no direct pathway, then training the model to generate intermediate text in order to move across that landscape and reach the desired output state is a good idea.
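
To make that concrete, here's a rough sketch of the mechanism (the `generate` function below is a hypothetical stand-in for any autoregressive sampling call, not a real API; the prompts are made up):

    # Sketch: why emitting intermediate tokens can help an autoregressive model.
    # Every token the model emits becomes part of the conditioning context for
    # the tokens that follow, so intermediate text widens the set of reachable
    # states before the final answer is produced.

    def generate(prompt: str, max_tokens: int) -> str:
        """Hypothetical stand-in for an LLM sampling call; returns canned text here."""
        return "<model output>"

    question = "A bat and a ball cost $1.10 total. The bat costs $1.00 more than the ball."

    # Direct answer: the model has to jump from the question straight to the
    # answer state in a single step.
    direct = generate(question + "\nAnswer:", max_tokens=5)

    # With intermediate text: the model first emits tokens that move it through
    # intermediate states, and the final answer is conditioned on all of them.
    scratch = generate(question + "\nWork through it step by step:", max_tokens=200)
    with_steps = generate(question + "\n" + scratch + "\nAnswer:", max_tokens=5)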

Presumably LLM companies are aware that there is, in general, no necessary relationship between the generated intermediate text and the output. The point of the article is that by calling it a "chain of thought" rather than "essentially-meaningless intermediate text which increases the number of potential states the model can reach", users are misled into thinking that the model is reasoning. They may then make unwarranted assumptions, such as that the model could apply the same reasoning to similar problems, which in general it cannot.

Meaningless? Participating in a path that yields a useful prediction is a kind of meaning. Just a different one.

And Gemini has a note at the bottom warning that it can make mistakes, and plenty of people discuss this. Caveat emptor, as usual.