Comment by nojs
2 days ago
Surprisingly, my experience with Qwen has been the opposite: if you can force the thinking trace into English, the results seem better. But that's probably just due to the amount of English training data.