Comment by teaearlgraycold

6 days ago

Personally I've found these bigger models (o3/Claude 4 Opus) to be disappointing for coding.

9 comments

teaearlgraycold

Opus is really great, but through Claude Code. If you used Cursor or RooCode, it's understandable that you were disappointed.

bitpush 6 days ago
This matches my experience, but I can't explain it. Do you know what's going on?
- eunoia 6 days ago
  
  My understanding is context size. Companies like Cursor are trying to minimize the amount of context sent to the models to keep their own costs down. Claude Code seems to send a lot more context with every request and that seems to make the difference.
- supermdguy 6 days ago
  
  Just guessing, but the new Opus was probably RL-tuned to work better with Claude Code's tool calls.
jedisct1 6 days ago
I had the opposite experience. Not with Opus (too expensive), but with Sonnet. I got things done way more efficiently when using Sonnet with Roo than with Claude Code.
- rgbrenner 6 days ago
  
  Same. I ran a few tests ($100 worth of API calls) with Opus 4 and didn't see any difference compared to Sonnet 4 other than the price.
  Also, no idea why he thinks Roo is handicapped when Claude Code nerfs the thinking output and requires typing "think" / "think hard" / "think harder" / "ultrathink" just to expand the max thinking tokens, which even on ultrathink only sets it at 32k, when the max in Roo is 51200 and it's just a setting.
  

apwell23 6 days ago

I found them all disappointing in their own ways. At least DeepSeek models actually listen to what I say instead of ignoring me and doing their own thing like a toddler.