Comment by dajonker

20 hours ago

Wouldn't be surprised if they slowly start quantizing their models over time. Makes it easier to scale and reduce operational cost. Also makes a new release have more impact as it will be more notably "better" than what you've been using the past couple of days/weeks.

It sure feels like they do this. They claim they don't, but when you use it every day for 5-10 hours, you notice when something changes.

This last week it seems way dumber than before.

I don't think so. There are other knobs they can tweak to reduce load that affect quality less than quantizing, like trimming the conversation length without telling you, reducing reasoning effort, etc.
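
To be clear about what I mean, here's a purely hypothetical sketch (Python, made-up names, not anyone's actual serving code) of how trimming context and cutting the reasoning budget under load could work without touching the weights:

    # Hypothetical load-shedding: under heavy load, keep the system prompt
    # plus only the most recent turns, and halve the reasoning-token budget,
    # instead of swapping in quantized weights.
    def prepare_request(messages, reasoning_budget, load_factor):
        if load_factor > 0.8:
            messages = [messages[0]] + messages[-10:]   # drop middle history
            reasoning_budget = max(1024, reasoning_budget // 2)
        return messages, reasoning_budget

Both knobs degrade long conversations without any change visible in pricing or model names.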

Open-weights models such as GPT-OSS and Kimi K2.x are trained with 4-bit layers. So it wouldn't come as a surprise if the closed models do similar things. If I compare Kimi K2.5 and Opus 4.5 on OpenRouter, output tokens are about 8x more expensive for Opus, which might indicate Opus is much larger and doesn't quantize, but the Claude subscription plans muddy the waters on price comparison a lot.
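
Back-of-envelope (illustrative numbers only, not any vendor's real figures): weight memory scales linearly with bit width, which is most of why 4-bit serving is attractive.

    # Rough weight-memory footprint: bytes = params * bits / 8
    params = 1e12  # hypothetical 1T-parameter model
    for bits in (16, 8, 4):
        print(f"{bits}-bit weights: ~{params * bits / 8 / 1e9:.0f} GB")
    # 16-bit: ~2000 GB, 8-bit: ~1000 GB, 4-bit: ~500 GB

So a 4-bit model needs roughly a quarter of the weight memory of the same model at FP16, which translates fairly directly into fewer GPUs per replica.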

I would be surprised tbh.

Anthropic does not exactly act like they're constrained by infra costs in other areas, and noticeably degrading a product when you're in tight competition with 1 or 2 other players with similar products seems like a bad place to start.

I think people just notice the flaws in these models more the longer they use them: the "honeymoon-hangover effect," a real pattern that has been shown in a variety of real-world situations.

Oooff yes I think that is exactly the kind of shenanigans they might pull.

Ultimately I can understand it: if a new model comes in without as much optimization, that adds pressure to serve the older models more cheaply while achieving the same results.

Nice plausible deniability for a convenient double effect.

I haven't noticed much difference in Claude, but I swear Gemini 3 Pro Preview was better in the first week or two and later started feeling like they quantized it down to hell.

Benchmarks like ARC-AGI are strongly correlated with price and cheap to run. I think it would be very easy to prove whether the models are degrading.
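
It wouldn't even take much code. A minimal sketch, assuming a fixed task set and a hypothetical query_model() function for whatever API you're testing:

    # Re-run the same fixed task set against the same model ID on a schedule
    # and log accuracy; a sustained drop over weeks would be real evidence.
    import json, time

    def run_snapshot(tasks, query_model):
        correct = sum(query_model(t["prompt"]).strip() == t["answer"] for t in tasks)
        return {"timestamp": time.time(), "accuracy": correct / len(tasks)}

    def append_snapshot(tasks, query_model, log_path="degradation_log.jsonl"):
        with open(log_path, "a") as f:
            f.write(json.dumps(run_snapshot(tasks, query_model)) + "\n")

Flat accuracy over time would also be the cleanest way to rule the quantization theory out.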