Comment by zone411 (14 days ago)

It's interesting that there are no reasoning models yet, 2.5 months after DeepSeek R1. It definitely looks like R1 surprised them. The released benchmarks look good.

Large context windows will definitely be the trend in upcoming model releases. I'll soon be adding a new benchmark to test this more effectively than needle-in-a-haystack (there are already a couple of benchmarks that do that).
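For anyone unfamiliar with the standard test: a needle-in-a-haystack eval plants a unique fact at some depth inside a long filler context and checks whether the model can retrieve it. A minimal sketch of the idea (the `query_model` stub, the filler text, and the needle are all placeholders, not any particular benchmark's code):

```python
# Minimal needle-in-a-haystack sketch (illustrative only).
# Plant a unique "needle" fact at a chosen depth inside long filler
# text, then ask the model to retrieve it.

NEEDLE = "The secret launch code is 7d3f9a."
QUESTION = "What is the secret launch code?"
FILLER = "The quick brown fox jumps over the lazy dog. " * 4000  # long context

def build_prompt(depth: float) -> str:
    """Insert the needle at `depth` (0.0 = start, 1.0 = end) of the filler."""
    pos = int(len(FILLER) * depth)
    context = FILLER[:pos] + " " + NEEDLE + " " + FILLER[pos:]
    return f"{context}\n\nQuestion: {QUESTION}\nAnswer:"

def query_model(prompt: str) -> str:
    # Placeholder: swap in a real API call here.
    raise NotImplementedError

def run_eval(depths=(0.0, 0.25, 0.5, 0.75, 1.0)) -> dict:
    """Score retrieval at several depths; 1 = needle found in the answer."""
    results = {}
    for d in depths:
        answer = query_model(build_prompt(d))
        results[d] = int("7d3f9a" in answer.lower())
    return results
```

The limitation the comment alludes to is that this only tests literal retrieval of a single planted fact, not whether the model can actually reason over long context.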

All these models are very large, so it will be tough for enthusiasts to run them locally.

The license is still quite restrictive. I can see why some might think it doesn't qualify as open source.

> It's interesting that there are no reasoning models yet

This may be merely a naming distinction, leaving the "reasoning" label open for a future release based on their recent research, such as Coconut[1]. They did RL post-training, and when fed logic problems the model appears to do a significant amount of step-by-step thinking[2]. It just doesn't wrap it in <thinking> tags.

[1] https://arxiv.org/abs/2412.06769 "Training Large Language Models to Reason in a Continuous Latent Space"
[2] https://www.youtube.com/watch?v=12lAM-xPvu8 (skip through this; it's recorded in real time)
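For context on the tag convention being referenced: R1-style reasoning models emit their chain of thought between explicit tags (DeepSeek R1 uses `<think>…</think>`), which clients strip before showing the final answer. A minimal sketch of that separation (the sample completion is made up):

```python
import re

# R1-style completions separate reasoning from the answer like this:
completion = (
    "<think>The cube has 6 faces, each 3x3 = 9 stickers, "
    "so 54 stickers total.</think>\n"
    "A standard Rubik's cube has 54 stickers."
)

# Split the explicit reasoning trace from the final answer.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

reasoning = "\n".join(THINK_RE.findall(completion)).strip()
answer = THINK_RE.sub("", completion).strip()

print("reasoning:", reasoning)
print("answer:", answer)
```

The observation above is that this model seems to do the same kind of step-by-step work, just inline in the visible output rather than behind tags.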

But if the final result is of high enough quality, who cares about reasoning? It's a trick for pushing quality higher, at the cost of tokens and latency.

  • Reasoning gives you the option to trade $ for additional performance; it seems like you'd always want that optionality for any model (see the rough arithmetic below).
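To make the trade-off concrete, a back-of-the-envelope sketch (the price and token counts are made-up assumptions, not any provider's actual rates):

```python
# Back-of-the-envelope: cost of a reasoning call vs. a direct answer.
# All numbers below are illustrative assumptions, not real pricing.

PRICE_PER_M_OUTPUT_TOKENS = 10.00  # $ per 1M output tokens (hypothetical)

def cost(output_tokens: int) -> float:
    return output_tokens / 1_000_000 * PRICE_PER_M_OUTPUT_TOKENS

direct_tokens = 300            # short direct answer
reasoning_tokens = 300 + 5000  # same answer plus a long thinking trace

print(f"direct:    ${cost(direct_tokens):.4f}")
print(f"reasoning: ${cost(reasoning_tokens):.4f}")
# The reasoning call here costs ~18x more (and adds latency), which is
# exactly the $-for-performance dial the comment describes.
```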