Comment by doug_durham

21 hours ago

GLM-5.1 isn't just as good. It is no match for Opus running in Claude Code. Please try it yourself. Open source models are about a year behind at least.

This is profoundly misinformed. I use all three of those models regularly and the difference is just not that big anymore. GLM 5.1 is at least as good as Opus 4.5. When it’s my dime, it’s the primary model I use; I switch to GPT 5.5 for planning and review, but GLM 5.1 is also very capable at those things. If I had to pay API rates for everything, there is no question I would only use GLM 5.1 (and Minimax for exploration tasks).

At work I mostly use Claude Code and a bit of Codex; personal projects are OpenCode and honestly I prefer it.

  • I would agree here. And in my experience Qwen 27B and Deepseek v4 are also extremely good.

    None of them are quite Opus, but they are damned close and a no-brainer if you care at all about cost.

In the second half of last year, I found that agentic coding with proprietary models (≈ vibe coding) reached the point where it actually speeds up my ability to deliver useful code at work. Before that, AI-based autocomplete definitely helped, but (despite the claims of the people selling AI coding tools) letting an agent author more than a file or so at a time (often a function or so at a time) required a very intricate plan or it would create a mess. Creating that plan or cleaning up the mess would take longer than just doing everything myself.

For me, it feels like widely available open models have recently crossed that same canyon. Are they as good as e.g. late-model Claude Opus? I don't think so. But they have absolutely crossed the threshold where they become beneficial. This means that, for me, they are about six months behind.

  • Exactly this. GLM 5.1 is the first open model that I thought "actually worked" for agentic coding, which puts it in the same tier as Opus 4.5 - which was where I flipped.

For coding I wouldn't say a year: this time last year, Claude and GPT definitely weren't able to do what GLM can do today. But easily 6 months, I'd say.

Not sure about other domains though.

I use composer-2 daily for complex programming tasks. It's a fine-tuned Kimi 2.5 - nothing groundbreaking. I've even had reasonable success using Qwen 3.5 on my desktop GPU. Opus might be better, but it's certainly not necessary to get good results.