← Back to context

Comment by nl

7 hours ago

Kilo (the open source coding agent) tested Deepseek v4 Pro and Flash vs Opus 4.7 and Kimi K2[1].

It did ok, but scored substantially less than Opus. It also cost nearly as much, even with the current launch promo pricing for Deepseek.

That cost is interesting - I've seen similar things with Sonnet vs Opus, and in my own benchmarking there are some models that benchmark well, seem to have a good price but use so many tokens they cost just as much as "more expensive" models.

[1] https://blog.kilo.ai/p/we-tested-deepseek-v4-pro-and-flash

Their pricing shown is without the discount.

> With DeepSeek’s 75% promo applied to current rates, the same run would have cost closer to $0.55, putting it below Kimi K2.6 in absolute cost while scoring 9 points higher.

I will be sad when the discount ends.