Comment by pzmarzly

18 hours ago

10x cheaper per token than Claude, am I reading that right?

As long as it doesn't mean 10x worse performance, that's a good selling point.

Something like GPT 5-mini is a lot cheaper than even Haiku, but when I tried it, it was so bad it was a waste of time. It's probably still more than 1/10 the performance of Haiku, though?

At work, where my employer pays for it, Haiku tends to be the workhorse, with Sonnet or Opus stepping in when I see it flailing. On my own budget I’m a lot more cost conscious, so Haiku actually ends up being “the fancy model” and minimax m2 the “dumb model”.

Even if it is 10x cheaper and 2x worse, it's going to eat up even more tokens spinning its wheels trying to implement things or squash bugs, and you may end up spending more because of that. Or at least spending way more of your time.
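To put rough numbers on that trade-off, here is a minimal sketch. It assumes you retry a task until it succeeds, so the expected number of attempts is 1 / success rate. The function name and every price, token count, and success rate below are made up for illustration, not measurements of any real model.

```python
def cost_per_solved_task(price_per_token, tokens_per_attempt, success_rate):
    """Expected spend to get one solved task, retrying until an attempt succeeds."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    expected_attempts = 1 / success_rate  # mean of a geometric distribution
    return price_per_token * tokens_per_attempt * expected_attempts


# Hypothetical "fancy" model: 10x the price, fewer tokens, succeeds more often.
strong = cost_per_solved_task(price_per_token=10.0, tokens_per_attempt=1_000, success_rate=0.8)

# Hypothetical "cheap" model: 1/10 the price, 3x the tokens, half the success rate.
cheap = cost_per_solved_task(price_per_token=1.0, tokens_per_attempt=3_000, success_rate=0.4)

print(f"strong: {strong:.0f}  cheap: {cheap:.0f}")
```

With these invented inputs the cheap model still comes out ahead on spend per solved task. Under this model it only loses once its token multiplier divided by its relative success rate exceeds the price ratio, so at 10x cheaper and half the success rate it can burn up to 5x the tokens per attempt before it costs more. None of this accounts for your own time, which is the other half of the complaint.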

  • Its SWE benchmark score is comparable to other open models and just a few points below the top-notch models, though.

Is it? The actual SOTA models are not amazing at coding, so at least for me there is absolutely no reason to optimize for price at the moment. If I am going to use an LLM for coding, it makes little sense to settle for a worse coder.

  • I dunno. Even pretty weak models can be decently performant, and 9/10 the performance for 1/10 the price means 10x the output, and for a lot of stuff that quality difference doesn't really matter. Considering even SOTA models are trash, slightly worse doesn't really make that much difference.