
Comment by zonghao

2 days ago

I think this might be why GPT and Gemini sometimes choose to think in Chinese during their reasoning process, even for purely English prompts. It may be easier for the model to express what it means that way, and thus more conducive to its reasoning. Of course, a better way to reason would be to think in vector space rather than by producing tokens that humans can read.

Surprisingly, my experience with Qwen has been the opposite: if you force the thinking trace into English, the results seem better. But that's probably just due to the amount of training data.