Opus 4.5 ~= Opus 4.6 high. Opus 4.5 was nerfed just before or after the release of 4.6.
The models don’t change.
On paper. There's huge financial incentive to quantize the crap out of a good model to save cash after you've hooked in subscriptions.
Real-world usage suggests otherwise. It's been a known trend for a while. Anthropic even confirmed as much ~6 months ago but said it was a "bug" - one that somehow just keeps happening 4-6 months after a model is released.
You sure about that?
https://marginlab.ai/trackers/claude-code/
They do. I'm currently seeing a degradation on Opus 4.6 on tasks it could do without trouble a few months back. Obviously I'm a sample of n=1, but I'm also convinced a new model is around the corner and they preemptively nerf their current model so people notice the "improvement".
Only nominally...
Oh yes, they do.
I think the conspiracy theories are silly, but equally I think pretending these black boxes are completely stable once they're released is incorrect as well.