Comment by SwellJoe

6 hours ago

The good news (for you and most everyone other than the current leading AI companies) is that the gap between the SOTA and the near-frontiers is getting smaller every week or two. The leading Chinese models are only a few months behind now (GLM 5.2 tickles the tail of GPT 5.3 or 5.4 and Opus 4.6, according to benchmarks and the vibes among heavy users who've spent some time with it), where they were a couple of years behind a year ago.

4.6 was released at the beginning of February, so if the Chinese models only "tickle its tail," that means they're >5 months behind.

  • That comparison is also misleading because Opus 4.6 was probably not Anthropic's frontier model.

    We got the first news about Mythos in March, so it is likely that it was already close to ready by the time Opus 4.6 was released.

    So the actual gap is the time elapsed between March (or April for the official announcement) and whenever Chinese models can match Mythos.

    • Post-training a model that size takes months, though it "works" before that. It is a big chunky model before it's released to the world and probably does some amazing things, sometimes... but it wasn't done (else why wouldn't they release it and soundly trounce their competitors?). I would assume that Chinese AI companies have a pipeline too, and that what we see is a couple or a few months behind their newest model as well. Like, the new base model is cooked, but they're still plating it for service.

      Why would Anthropic get the benefit of pre-release models counting toward their lead, if nobody else gets to count their pre-release models?

  • > The leading Chinese models are only a few months behind now

    • I hear that often, but what does that even mean? I am a great proponent of open-weights models. I do believe they are the only reason we have not stagnated into collusion to halt (public) model releases.

      But at exactly which point in time is z.ai being compared to claude.ai? Consistently being "6 months behind" in an exponentially accelerating evolution means the gap is growing exponentially wider, not staying constant.

This is nonsense.

The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

China has no flywheel for long-form agentic traces like Claude Code and its telemetry over its userbase (no one uses the Chinese harnesses yet). Most Chinese models are forced to price themselves significantly below cost to compete with the huge demand for bootleg Claude tokens, because they're that much worse.

  • Ah, well, if Anthropic says their competitors are ten months behind...

    I don't know what I was thinking.

  • > is estimated at 10 months by Anthropic themselves, and it's growing.

    How is this different than any business with something to lose saying a competitor isn't as good? Not saying it's false, but it would seem to me that it's more important how customers feel about the issue.

  • Here in Australia the sudden withdrawal of Fable made all of us think hard about models and harnesses.

    In the last few weeks I've heard half a dozen people talk about how a less advanced model coupled with a better harness outperforms a smarter model.

    If the USA wanted to shoot its AI industry in the foot, it achieved its goal.

  • If Anthropic themselves say competition is 10 months behind, it's probably 5 or less.

    And you seem to think "no one uses" DeepSeek's v4, z.AI's GLM 5.2 or Xiaomi's MiMo 2.5 from their official APIs, when they probably dwarf Anthropic's usage and are widening the gap by conquering a chunk of the Western market too.

    I know it's hard for some to comprehend that there's an entire Eastern hemisphere on the globe with billions of people, so it's worth reminding. Some even seem to think the world is basically Silicon Valley.