Comment by KronisLV

18 hours ago

For development use cases, I switched to Sonnet 4.5 and haven't looked back. I mean, sure, sometimes I also use GPT-5 (and mini) and Gemini 2.5 Pro (and Flash), and Cerebras Code just switched to providing GLM 4.6 instead of the previous Qwen3 Coder, so those as well. But in general the frontier models are pretty good for development, and I wouldn't have much reason to use something like Sonnet 4 or 3.7 or whatever.

I have canceled my Claude Max subscription because Sonnet 4.5 is just too unreliable. For the rest of the month I'm using Opus 4.1 which is much better but seems to have much lower usage limits than before Sonnet 4.5 was released. When I hit 4.1 Opus limits I'm using Codex. I will probably go through with the Codex pro subscription.

  • > [...] I'm using Opus 4.1 which is much better but seems to have much lower usage limits than before Sonnet 4.5 was released [...]

    Yes, it's down from 40h/week to 3-5h/week on Max plan, effectively. A real bummer. See my comment here [1] regarding [2].

    [1] https://github.com/anthropics/claude-code/issues/8449

    • Glad I'm not imagining it, I'll be cancelling my sub. Paying for things only for them to get worse and the provider hoping I don't notice is such a fucking vile tactic.

      In my experience sonnet 4.5 is basically pointless, it often gets non-trivial tasks wrong, and for trivial tasks I can use a local model or one of the myriad of providers that give free inference.

      EDIT: Holy shit I read the github issue, fuck these people.

      > We highly recommend Sonnet 4.5 -- Opus uses rate limits faster, and is not as capable for coding tasks.

      They're just straight gaslighting us now lmao.

  • Sonnet 4.5 is way worse than Opus 4.1 -- it's incredible that they claim it's their best coding model.

    It's obvious if you've used the two models for any sort of complicated work.

    Codex with GPT-5 codex (high thinking) is better than both by a long shot, but takes longer to work. I've fully switched to Codex, and I used Claude Code for the past ~4 months as a daily driver for various things.

    I only reach for Sonnet now if Codex gets cagey about writing code -- then I let Sonnet rush ahead, and have Codex align the code with my overall plan.

Yeah, I'm just going through the Cerebras migration at the moment.

It's a shame Cerebras completely dropped Qwen3 Coder's fast tool calling, short and instant responses, and better speed overall for GLM 4.6 thinking. Qwen3 is really good at hitting the tools first, then coming up with a well-grounded answer based on reality. Sometimes it's good when a model is Socratic: just knows it knows nothing.

GLM 4.6, on the other hand, is more self-sufficient: if it sees the problem and knows it, it thinks and thinks and finally just fixes it in one or two shots, so when you hit the jackpot, it's probably an improvement over Q3C. But when it doesn't get it right, it digs itself into a hole larger than Olympus Mons.

For development use cases, it's best to use multiple models anyway. E.g. my favorite model is the Gemini 2.5 Pro, but there are certain cases where Qwen3 Coder gives much better results. (Gemini likes to overthink.) It's like having a team of competent developers provide their opinions. For important parts (security, efficiency, APIs), it's always good to get opinions from different sources.

What tool are you using to enable switching between so many models?

  • For local chat, Jan seems okay, or OpenWebUI for something hosted. For IDE integrations, some people enjoy Cline a bunch, but RooCode also lets you have multiple roles (like ask/code/debug/architect, with different permissions, e.g. no file changes in ask mode) as well as preconfigured profiles for the various providers and models, so you can switch with a dropdown, even in the middle of a chat. There's also an Orchestrator mode, so I can use something smart for splitting up tasks into smaller chunks and a dumber but cheaper model for the execution. Aside from that, most of the APIs seem OpenAI-conformant, so switching isn't that conceptually difficult. Also, if you want to try a lot of different models, you can try OpenRouter.
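      Since most of these APIs follow the OpenAI chat-completions shape, "switching models" mostly means switching a base URL and a model name. A minimal sketch of such provider profiles in Python (the URLs and profile names below are illustrative placeholders, not real endpoints):

      ```python
      # Provider profiles: switching models is just picking a different
      # base URL + model name. All values here are placeholders.
      PROFILES = {
          "gemini": {"base_url": "https://example.com/gemini/v1", "model": "gemini-2.5-pro"},
          "qwen":   {"base_url": "https://example.com/qwen/v1",   "model": "qwen3-coder"},
          "glm":    {"base_url": "https://example.com/glm/v1",    "model": "glm-4.6"},
      }

      def profile(name):
          """Look up a provider profile; fail clearly on unknown names."""
          try:
              return PROFILES[name]
          except KeyError:
              raise ValueError(f"unknown profile {name!r}; choose from {sorted(PROFILES)}")

      # With the openai package you would then do something like:
      #   client = openai.OpenAI(base_url=profile("qwen")["base_url"], api_key=...)
      #   client.chat.completions.create(model=profile("qwen")["model"], ...)
      cfg = profile("glm")
      print(cfg["model"])  # -> glm-4.6
      ```

      This is essentially what the dropdown in tools like RooCode or opencode does for you.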

  • Something like opencode probably, that’s what I have been using to freely and very easily switch between models and keep all my same workflows. It’s phenomenal really