Comment by Sol-
1 day ago
I don't want to be overly cynical and am in general in favor of the contrarian attitude of simply taking people at their word, but I wonder if their current struggles with compute resources make it easier for them to choose to not deploy Mythos widely. I can imagine their safety argument is real, but regardless, they might not have the resources to profitably deploy it. (Though on the other hand, you could argue that they could always simply charge more.)
I would have not believed your argument 3 months ago but I strongly suspect Anthropic actively engages in model quality throttling due to their compute constraints. Their recent deal for multi GWs worth of data center might help them correct their approach.
For what it's worth Anthropic explicity denies that. "To state it plainly: We never reduce model quality due to demand, time of day, or server load"
Also can see https://marginlab.ai/trackers/claude-code/
It's very interesting to me how widespread this conception is. Maybe it's as simple as LLM productivity degrading over time within a project, as slop compounds.
Or more recently since they added a 1m context window, maybe people are more reckless with context usage
It has nothing to do with the context window. Reasoning brought measured approaches grounded with actual tool calls. All of that short-circuits into a quick fix approach that is unlike Opus-4.5 or 4.6. Sonnet-4.5 used to do that. My context window is always < 200K.
That still leaves open the possibility that they reduce model quality due to profit. ;p
Posted this a while ago:
>Models are not "degrading". They're not being "secretly quantized". And no one is swapping out your 1.2T frontier behemoth for a cheap 120B toy and hoping you wouldn't notice!
>It's just that humans are completely full of shit, and can't be trusted to measure LLM performance objectively!
>Every time you use an LLM, you learn its capability profile better. You start using it more aggressively at what it's "good" at, until you find the limits and expose the flaws. You start paying attention to the more subtle issues you overlooked at first. Your honeymoon period wears off and you see that "the model got dumber". It didn't. You got better at pushing it to its limits, exposing the ways in which it was always dumb.
>Now, will the likes of Anthropic just "API error: overloaded" you on any day of the week that ends in Y? Will they reduce your usage quotas and hope that you don't notice because they never gave you a number anyway? Oh, definitely. But that "they're making the models WORSE" bullshit lives in people's heads way more than in any reality.
Inference is where they make the money they spend on training, so this feels unlikely. Perhaps this does not true for Mythos though