Comment by voxgen
13 days ago
> It's interesting that there are no reasoning models yet
This may be merely a naming distinction, leaving the name open for a future release based on their recent research such as Coconut[1]. They did RL post-training, and when fed logic problems the model appears to do a significant amount of step-by-step thinking[2]. It seems it just doesn't wrap that thinking in <thinking> tags.
[1] https://arxiv.org/abs/2412.06769 "Training Large Language Models to Reason in a Continuous Latent Space"
[2] https://www.youtube.com/watch?v=12lAM-xPvu8 (skip through this - it's recorded in real time)