Comment by cuuupid

5 days ago

Not missing the forest for the trees, this effectively means in 3-5 months China will drop open source models that are every bit as capable and dangerous as current day Mythos except with no safeguards.

And the only companies safe from this are the large corporations that shook hands with Anthropic? Because Fable doesn't seem to have actual safeguards, more like 'if you talk about this you will be talking to Opus.' It doesn't guard against offensive use, it prevents all use (offensive AND defensive).

Rationalists are inventing oligopolies from first principles, absolutely incredible things happening in SF

My bet is that Mythos is still over-hyped and the cybersecurity fear and guardrails are mostly marketing to force company partnerships through Glasswing and get public attention.

It's not even very usable... I tried 2 different chats and both eventually got stopped due to the safeguards

One was a piece of code I gave it to improve. It did so and then started writing tests, some of which tested security, so the safeguards triggered.

Another was one of the cryptography puzzles I use as new model tests, which are hard to one-shot and have no public solution anywhere. It completely refused to even try to solve it.

  • I tried 2 chats and it declined both.

    - 1st chat asked about the most likely mechanisms of a minor shoulder injury

    - 2nd chat asked about optimal bloodwork testing markers

    • It seems to dislike biological chats. It also rejected a chat about a rare condition I have, which I am running with 4.8 as well.

  • So the degradation to Opus 4.8 from the article isn't happening in practice?

    • No, you get an AUP violation and have to manually swap the model

      (I had the same issue, just asked it to check some code that 4.8 had modified earlier in the day)

    • It is, it asks you if you want to continue as Opus 4.8… but I was trying precisely to evaluate Fable

  • Oh joy. A model whose safeguards make it prone towards code that makes your systems less safe. How brilliant!

They're trained in a model class likely in the 2t to 3t range. It's very unlikely that Chinese labs have access to GPU systems capable of training models like that, let alone serving them. This requires proprietary room-scale systems, which fetch a huge premium over typical 10 slot systems.

I am sure that they can develop their own equivalent version of such clusters in around 1 year, though. Distilling Fable 5 will also go a long way.

  • DSv4 is nearly in the 2t range, but yes you're generally right

    • MoE experts were likely trained independently / in a sparse format. Training anything beyond 2t on typical systems would be infuriatingly slow. You could do 4t on Nvidia's room-scale solution, but for a reasonable training speed / batch size it caps around 3t.

  • Ah, American hubris... I don't blame you, Hollywood is the world's greatest propaganda machine of all time.

I think we're about to see a big relative drop-off of open models vs closed. I don't think there'll be an open model that competes with Mythos for ~2 years.

Even OpenAI and Google are struggling to get this kind of performance. If the distillation defenses are any good + chip controls prevent China from training massive models, it's over.

  • I think the Chinese have identified this gap and are working overtime on sovereign inference tech including chips.

    • They have, but even with the whole CCP backing you, you can't just catch up on the chip war overnight. It's going to take time to get their memory and compute industries where they need to be. Meanwhile, barring an invasion of Taiwan, the US will have Rubin-class models and then whatever the next tier is, within 3 years.

I wonder if model distillation will continue to work as well as it has. Given hidden reasoning, the ever expanding number of expected capabilities, a serious compute shortage, the looming possibility of model collapse, and dramatically higher API costs I would guess that it's getting much harder to do.

  • You should check out some Chinese forums. There are services selling gateways/proxies for all major models at a fraction of the official rates. Likely reselling subscriptions, or some other form of abuse.

    I've seen people posting screenshots of billions of tokens consumed where they paid next to nothing.

    These same gateways are likely also reselling the data to Chinese labs, because TLS has to terminate at the gateway level.

  • Asian labs generated synthetic datasets from US labs but also innovated with technology. Now it is harder to get the thinking traces AND Anthropic is reported to be poisoning them as well.

    Thus Asian labs will have to generate their own datasets, which, with the huuuuge usage boom from DeepSeek, MiMo, Kimi, etc., they will be able to.

There's also a reality where China does develop a Mythos-level model but stops releasing the weights.

That reality is much scarier.

  • That's the reality China already lives in. Their weapon against US companies is commoditizing them, eliminating their moats and their profits by going open weights.

    Same thing Meta was doing before they fell behind.

    • > Same thing Meta was doing before they fell behind.

      Obviously unrelated to the OP, but it's crazy to me how incompetent Meta is at everything new they try to do.

      They burned billions of dollars on the most ridiculous project one could ever think of - somehow thinking that VR is the future.

      Then they did catch the initial wave of the actual future with AI, they were at the forefront of open weight models - and failed at that too.

      What is even happening there?

My experience is that open weight models from China are at least ~12 months behind. In some workloads they may be closer, in others further away.

I also find that the harness and product you wrap around models can often narrow that gap considerably.

Opus 4.6, for example, was head and shoulders above GLM 5.1 on a PR-for-PR basis. Perhaps GLM 5.1 was a bit under Sonnet 4.6 at the time. That's roughly a year or so behind.

Much cheaper though! I'm bullish on open weight models. I have no idea where all these curves will top out. Can the frontier labs keep the year-plus lead? Do open labs get close enough to SOTA that they gain adoption across many tasks and drive down inference prices? Who knows, not me.

I wonder where the trees are. In this thread nobody appears to actually be talking about the model.

  • Yeah, because it's impossible. You can't ask it anything about the thing that it's known for. It will not even answer a sky-high level question about reverse engineering, for example.

    In CC, it will probably report you to authorities if you ask it to do a vulnerability scan of your codebase.

Isn't that a good thing in a way? If everyone has the weapon and defense at the same time, we will fix security holes and live safer lives instead of having some three letter agencies and military backdoors in everything.

Pandora's box is open anyway. It's better now for everyone to have the same power rather than a few nation states.

  • Not sure this holds, sadly. I spent a few months reporting serious security bugs as model capabilities took off earlier this year, and only ~half were fixed. The unfixed bugs were just as critical as the fixed ones; sometimes there were even two similarly critical bugs at the same company, and only one would be fixed!

    On your other point, the government still has systemic leverage and can compel access, so this doesn't remove that risk.

    That doesn't mean this is the end of the world, and some balance of power is usually good. But I do think it will still increase the capabilities of rogue actors and their net harm.

It's more evidence that the future is local. With some time we'll all be running highly capable & efficient open-source models on dedicated NPUs. No censorship, no rate limits, no overpriced subscriptions.

Oh they might try to put in place safeguards, but Qwen has had no problem being abliterated

3-5 months is a long time, and they are pretty useless on arrival because the frontier models are so good that it's hard to go back, even if it's way cheaper. Your workflow has been adapted to that level of intelligence for months.

  • That doesn't match my experience at all. I can't see myself saying in 6 months that the current model I am using is useless, that makes no sense.

    In fact, I did go back to DeepSeek V4 Flash for most of my problems as it is way cheaper and there is no need to use SOTA for absolutely everything.

> every bit as capable and dangerous as current day Mythos except with no safeguards

Not quite. They will definitely have "no criticism of China/communism" safeguards.

Oh please let’s stop with the Mythos “it’s dangerous” PR talk.

It's obvious Anthropic used it to hype things up and that’s about it.

> Rationalists are inventing oligopolies from first principles, absolutely incredible things happening in SF.

Based.

I don't think China has any incentive to arm the rest of the world with highly capable models that can be used against them. Undoubtedly they will continue with the arms race, but they will preserve the best stuff for their own use.

  • I think the stronger incentive is undermining/undercutting the Western AI companies. Given what we have seen, any model can be used/convinced to do harm so that is just part of the game

    • I agree, depending on how much of this is marketing and how much is actual capability. It's one thing to undercut models that finish writing assignments for lazy students. If this actually identifies vulns and writes exploits, or if it designs bioweapons, those are pretty different. Those are actual weapons, and I don't think they're going to arm the adversary.

  • A specific strategy is to arm absolutely everyone with very capable models, thus eliminating any advantage the U.S. could get from frontier AI.