Comment by pzmarzly

2 months ago

10x cheaper price per token than Claude, am I reading it right?

As long as it doesn't mean 10x worse performance, that's a good selling point.

7 comments

pzmarzly

Something like GPT 5-mini is a lot cheaper than even Haiku but when I tried it in my experience it was so bad it was a waste of time. But it’s probably still more than 1/10 the performance of Haiku probably?

In work, where my employer pays for it, Haiku tends to be the workhorse with Sonnet or Opus when I see it flailing. On my own budget I’m a lot more cost conscious, so Haiku actually ends up being “the fancy model” and minimax m2 the “dumb model”.

phildougherty 2 months ago

Even if it is 10x cheaper and 2x worse it's going to eat up even more tokens spinning its wheels trying to implement things or squash bugs and you may end up spending more because of that. Or at least spending way more of your time.

amarcheschi 2 months ago

The benchmark of swe places it in a comparable score with respect to open models and just a few points below the top notch models though

fastball 2 months ago

Is it? The actual SOTA are not amazing at coding, so at least for me there is absolutely no reason to optimize on price at the moment. If I am going to use an LLM for coding it makes little sense to settle for a worse coder.

gunalx 2 months ago
I dunno. Even pretty weak models can be decently performant, and 9/10 the performance for 1/10 the price means 10x the output, and for a lot of stuff that quality difference dosent really matter. Considering even sota models are trash, slightly worse dosent really make that much difference.
- fastball 2 months ago
  
  > SOTA models are "trash"
  > this model is worse (but cheaper)
  > use it to output 10x the amount of trashier trash
  You've lost me.
  
  1 reply →