
Comment by lysace

5 months ago

Isn't the reasoning thing essentially a bolt-on to existing trained models? Like basically a meta-prompt?

No.

DeepSeek and related projects have shown it’s possible to add reasoning to existing models via SFT, but that’s not the same thing as a prompt. And if you look at R1, they use a blend of techniques to get reasoning.

For Anthropic to ship a hybrid model where you can control this, the capability will have to be built directly into the model’s training, and probably its architecture as well.

If you’re a competent company with the best AI minds and a frontier model, you’re not just purely copying; you’re taking ideas while innovating and adapting.

The fundamental innovation is training the model to reason through reinforcement learning. You can fine-tune existing models on traces from these reasoning models to get within the same ballpark, but taking it further requires doing the RL yourself.
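The distillation route mentioned above can be sketched as a data-prep step: format the teacher model's reasoning traces into prompt/completion pairs and run ordinary SFT on them. This is a minimal, hypothetical illustration; the field names (`question`, `reasoning`, `answer`) and the `<think>` tag convention are assumptions, not any specific project's format.

```python
# Hypothetical sketch: turning a reasoning model's traces into SFT examples.
# The dict keys and the <think> tag convention are illustrative assumptions.

def trace_to_sft_example(trace: dict) -> dict:
    """Format one teacher trace as a prompt/completion pair for SFT."""
    prompt = trace["question"]
    # The student is trained to reproduce the teacher's chain of thought
    # inside <think> tags, followed by the final answer on its own line.
    completion = f"<think>{trace['reasoning']}</think>\n{trace['answer']}"
    return {"prompt": prompt, "completion": completion}

traces = [
    {
        "question": "What is 17 * 24?",
        "reasoning": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
        "answer": "408",
    },
]

sft_dataset = [trace_to_sft_example(t) for t in traces]
print(sft_dataset[0]["completion"].splitlines()[-1])  # -> 408
```

A dataset built this way gets you imitation of the teacher's reasoning style; pushing past the teacher's ceiling is where the RL step comes in.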