Comment by skerit

4 days ago

> When controlling for the number of tokens, NoThinking outperforms Thinking across a diverse set of seven challenging reasoning datasets

Interesting. I thought the "thinking" was useful because it pulls a lot of relevant concepts into the context, but I guess not, then?

It has also been said before that the text a model outputs during its Thinking step isn't actually a view into its inner reasoning. There are times when the model will "think" X but eventually answer Y.

But even so: the models _are_ better, right? So is the Thinking step then mostly useful during training?