Comment by knollimar

11 hours ago

Isn't it closer to Sonnet?

8 comments

knollimar

The Chinese open-weight models have been ahead of Sonnet (at least for coding) for a couple of months now. I tend to take benchmarks with a huge grain of salt, but in my own experience, the latest versions of Kimi, MiMo, and GLM (pre-5.2) had already surpassed Sonnet in terms of output quality for a fraction of the price.

With that said, I'm excited to try GLM 5.2. I still end up reaching for Opus and GPT 5.5 for many tasks because the open models tend to get stuck more often on complex problems.

knollimar 1 hour ago

I found Sonnet preferable to K2.6, but 2.7 code for Kimi seems better anecdotally.

redox99 11 hours ago

Definitely Opus-level for coding.

smith7018 11 hours ago
Do you have benchmarks or at least anecdotes to back that up? I'm not arguing with you; I would just love to see some proof that open models are getting as good as Anthropic's models.
- redox99 10 hours ago
  
  I've been running some test prompts comparing frontier models for webdev, particularly pretty visualizations, physics / orbital simulations, etc.
  Do note that GLM is not multimodal, which can be a deal breaker. And these open models are not good outside coding.
- unrvl22 10 hours ago
  
  Look at the benchmarks, use the model yourself. I'm usually first to call BS on every Chinese model that claims to be as good as Opus. This is finally the first one that actually is. It is a massive jump from every previous Chinese model.
  
knollimar 7 hours ago

Oic, I misremembered the OAI scores. I thought Sonnet had 51.