Comment by jclardy

10 hours ago

Just anecdotal, but I was using Claude Code for everything a few months ago, and it seemed great. Now, it is making a ton of mistakes, doing the wrong thing, misunderstanding context, and just generally being unusable.

I've now been using Codex and everything has been great (I still swap back and forth, but mostly just to check things out).

My theory is just that the models are great at release to get people to switch, and then their capabilities are slowly cut back until the next major release, to feed the hype cycle.

Is it the models themselves or the tools around them? There's that patch[1] that floats around for Claude Code that's supposed to solve a lot of these problems by adjusting its tool-level prompts. Also, if it were the models themselves, wouldn't Cursor users have the same complaints? (Do they? I haven't heard anything, but the only Cursor users I talk to are coworkers.)

I think it's more likely they're trying to optimize the Claude Code prompts to reduce load on their system and have overcorrected at the cost of quality.

1: https://gist.github.com/roman01la/483d1db15043018096ac3babf5...

Yeah, over a shorter time frame, but I've been noticing that too. Just the other day I was experimenting with some workflow stuff: "Do x and y, run the tests, and then merge into develop."

Duly runs, and finishes. "All merged into develop".

I do some other work, don't see any of it, and double-check myself: I'm working off of develop.

"Hey, where is this work?"

"It is in this branch and this worktree, as you would expect, you will need to merge into develop."

"I'm confused, I asked you to do that and you said it was done."

"You're right, and I did say that, but I didn't do it. Shall I do it now?"
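For what it's worth, you can verify that kind of claim yourself instead of trusting the agent. A quick sketch (the branch name `feature/my-branch` is a placeholder for whatever branch the agent was working on):

```shell
# Show commits on the feature branch that develop doesn't have yet.
# Empty output means the branch really was merged into develop.
git log --oneline develop..feature/my-branch

# Alternatively, list all branches whose commits are already
# reachable from develop; a merged branch will appear here.
git branch --merged develop
```

Running the first command right after the agent says "all merged" is a cheap sanity check before you build on top of develop.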

There's like this really weird balancing act: they're trying to manage usage, but failures like that just make people burn more tokens redoing the work...