Comment by KronisLV

18 hours ago

For development use cases, I switched to Sonnet 4.5 and haven't looked back. I mean, sure, sometimes I also use GPT-5 (and mini) and Gemini 2.5 Pro (and Flash), and Cerebras Code just switched to providing GLM 4.6 instead of the previous Qwen3 Coder, so those as well. But in general the frontier models are pretty good for development, and I wouldn't have much reason to use something like Sonnet 4 or 3.7 or whatever.

I have canceled my Claude Max subscription because Sonnet 4.5 is just too unreliable. For the rest of the month I'm using Opus 4.1 which is much better but seems to have much lower usage limits than before Sonnet 4.5 was released. When I hit 4.1 Opus limits I'm using Codex. I will probably go through with the Codex pro subscription.

  • > [...] I'm using Opus 4.1 which is much better but seems to have much lower usage limits than before Sonnet 4.5 was released [...]

    Yes, it's down from 40h/week to 3-5h/week on Max plan, effectively. A real bummer. See my comment here [1] regarding [2].

    [1] https://github.com/anthropics/claude-code/issues/8449

    • Glad I'm not imagining it, I'll be cancelling my sub. Paying for things only for them to get worse and the provider hoping I don't notice is such a fucking vile tactic.

      In my experience sonnet 4.5 is basically pointless, it often gets non-trivial tasks wrong, and for trivial tasks I can use a local model or one of the myriad of providers that give free inference.

      EDIT: Holy shit I read the github issue, fuck these people.

      > We highly recommend Sonnet 4.5 -- Opus uses rate limits faster, and is not as capable for coding tasks.

      They're just straight gaslighting us now lmao.

  • Sonnet 4.5 is way worse than Opus 4.1 -- it's incredible that they claim it's their best coding model.

    It's obvious if you've used the two models for any sort of complicated work.

    Codex with GPT-5 codex (high thinking) is better than both by a long shot, but takes longer to work. I've fully switched to Codex, and I used Claude Code for the past ~4 months as a daily driver for various things.

    I only reach for Sonnet now if Codex gets cagey about writing code -- then I let Sonnet rush ahead, and have Codex align the code with my overall plan.

Yeah, I'm just going through the Cerebras migration at the moment.

It's a shame Cerebras completely dropped Qwen3 Coder's fast tool calling, short and instant responses, and better speed overall for GLM 4.6 thinking. Qwen3 is really good at hitting the tools first, then coming up with a well-grounded answer based on reality. Sometimes it's good when a model is Socratic: just knows it knows nothing.

GLM 4.6, on the other hand, is more self-sufficient: if it sees the problem and knows it, it thinks and thinks and finally just fixes it in one or two shots, so when you hit the jackpot, it's probably an improvement over Q3C. But when it doesn't get it right, it digs itself into a hole larger than Olympus Mons.

For development use cases, it's best to use multiple models anyway. E.g. my favorite model is the Gemini 2.5 Pro, but there are certain cases where Qwen3 Coder gives much better results. (Gemini likes to overthink.) It's like having a team of competent developers provide their opinions. For important parts (security, efficiency, APIs), it's always good to get opinions from different sources.

What tool are you using to enable switching between so many models?

  • For local chat, Jan seems okay, or OpenWebUI for something hosted. For IDE integrations, some people enjoy Cline a bunch, but RooCode also lets you have multiple roles (like ask/code/debug/architect, with different permissions, e.g. no file changes in ask mode) as well as preconfigured profiles for the various providers and models, so you can switch with a dropdown, even in the middle of a chat. There's also an Orchestrator mode, so I can use something smart for splitting up tasks into smaller chunks and a dumber but cheaper model for the execution. Aside from that, most of the APIs seem OpenAI-conformant, so switching isn't that conceptually difficult. Also, if you want to try a lot of different models, you can try OpenRouter.
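      Since most of these APIs follow the OpenAI chat-completions shape, "switching models" mostly means switching a base URL and a model name. A minimal sketch of such provider profiles in Python (the URLs and profile names below are illustrative placeholders, not real endpoints):

      ```python
      # Provider profiles: switching models is just picking a different
      # base URL + model name. All values here are placeholders.
      PROFILES = {
          "gemini": {"base_url": "https://example.com/gemini/v1", "model": "gemini-2.5-pro"},
          "qwen":   {"base_url": "https://example.com/qwen/v1",   "model": "qwen3-coder"},
          "glm":    {"base_url": "https://example.com/glm/v1",    "model": "glm-4.6"},
      }

      def profile(name):
          """Look up a provider profile; fail clearly on unknown names."""
          try:
              return PROFILES[name]
          except KeyError:
              raise ValueError(f"unknown profile {name!r}; choose from {sorted(PROFILES)}")

      # With the openai package you would then do something like:
      #   client = openai.OpenAI(base_url=profile("qwen")["base_url"], api_key=...)
      #   client.chat.completions.create(model=profile("qwen")["model"], ...)
      cfg = profile("glm")
      print(cfg["model"])  # -> glm-4.6
      ```

      This is essentially what the dropdown in tools like RooCode or opencode does for you.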

  • Something like opencode probably, that’s what I have been using to freely and very easily switch between models and keep all my same workflows. It’s phenomenal really