Comment by cbg0

3 days ago

The open models are still far behind GPT 5.4 and Claude if you're using them for building software.

I don't think people realize how irrational that argument is (that SOTA is better, so you have to use SOTA).

Open weights will always trail SOTA. Forever. So let's say they continue to get better every year. In 100 years, the open weight model will be 100x better than today. But the SOTA model will be 101x better. And still, people will make this argument that you should pay a premium for SOTA. Despite the open weights being 100x better than what we have today.

The open weights today are better than the SOTA models from a year ago. Yet people were using the SOTA models for coding a year ago. If people used SOTA models a year ago, then it was good enough, right? So why isn't the same (or better) good enough now?

The answer is: it is good enough. But people are irrationally afraid of missing out (FOMO). They're not really using their brains. They're letting fear lead their decisions. They're afraid "something bad" will happen if they don't use the absolute latest model. Despite the repeatable, objective benchmarks telling us all that open weights are perfectly capable of doing real work today, the fear is that we're missing out on something better. So people throw away their money and struggle with rate-limits because of their fear.

  • I doubt there's any FOMO to it for people using it for professional coding, for example; you just want access to the best model as long as the value proposition is there. If I spend less time reviewing the work because it's great on the first pass, that's always a win.

    Unless you're an enthusiast and have the hardware to power larger models, $20 a month provides tremendous value.

    • Most of the people paying for the subscriptions have no idea how much difference, if any, there is between the SOTA model and a non-SOTA one. They don't even know if they're actually using the full capacity of their subscription, until they run over it. When they do run over it, was it because they were doing tons of AI work? Or was it because they were using expensive models? None of these people are checking any of these things. They're just afraid that if they don't buy the best, they will be missing out. aka, FOMO.

About a year behind, TBQH. Newer Mixture-of-Experts models are comparable to a slightly older Claude Sonnet, if you don't mind the (lack of) speed. Some benchmarks say they're competitive with the frontier models right now for certain tasks.

I'm not sure how much I trust those benchmarks; I have a feeling everyone is playing up to them in some way. Still, if you're willing to accept the latency, they're definitely usable.

Of course everyone has realized this, so the hardware you need to run them is a little bit on the expensive side right this minute.

CPU manufacturers are working on improvements so that you can more practically run models on regular CPU+RAM (it's already possible with llama.cpp, just even slower).
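For anyone curious what CPU-only inference looks like in practice, here's a minimal sketch using llama.cpp's llama-cli. The model filename and thread count are placeholders; you'd point it at whatever GGUF file you've downloaded and match the thread count to your physical cores:

```shell
# Run a GGUF model entirely on CPU+RAM with llama.cpp.
# -m: path to the model (placeholder here)
# -n: max tokens to generate
# -t: CPU threads (match your physical core count)
# -ngl 0: offload zero layers to the GPU, i.e. pure CPU inference
llama-cli -m ./model.gguf \
  -p "Write a function that reverses a string." \
  -n 256 -t 8 -ngl 0
```

Quantized GGUF variants (e.g. 4-bit) are what make larger models fit in ordinary RAM at all; the trade-off is exactly the slowness mentioned above.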