← Back to context

Comment by BoumTAC

4 hours ago

To me, mini releases matter much more and better reflect the real progress than SOTA models.

The frontier models have become so good that it's getting almost impossible to notice meaningful differences between them.

Meanwhile, when a smaller / less powerful model releases a new version, the jump in quality is often massive, to the point where we can now use them 100% of the time in many cases.

And since they're also getting dramatically cheaper, it's becoming increasingly compelling to actually run these models in real-life applications.

If you're doing something common then maybe there are no differences with SOTA. But I've noticed a few. GPT 5.4 isn't as good at UI work in svelte. Gemini tends to go off and implement stuff even if I prompt it to discuss but it's pretty good at UI code. Claude tends to find out less about my code base than GPT and it abuses the any type in typescript.

> And since they're also getting dramatically cheaper, it's becoming increasingly compelling to actually run these models in real-life applications.

They're not really cheaper than the SOTA open models on third-party inference platforms, and they're generally dumber. I suppose they're still worth it if you must minimize latency for any given level of smarts, but not really otherwise.

they do are cheaper than SOTA but not getting dramatically cheaper but actually the opposite - GPT 5.4 mini is around ~3x more expensive than GPT 5.0 mini.

Similarly gemini 3.1 flash lite got more expensive than gemini 2.5 flash lite.

  • But they are getting dramatically better.

    What's the point of a crazy cheap model if it's shit ?

    I code most of the time with haiku 4.5 because it's so good. It's cheaper for me than buying a 23€ subscription from Anthropic.

    • The crazy cheap models may be adequate for a task, and low cost matters with volume. I need to label millions of images to determine if they're sexually suggestive (this includes but is not limited to nudity). The Gemini 2.0 Flash Lite model is inexpensive and performs well. Gemini 2.5 Flash Lite is also good, but not noticeably better, and it costs more. When 2.0 gets retired this June my costs are going up.