Comment by libraryofbabel

3 days ago

I am saying this probably is "silly behavior by a government" and it is a milestone that points towards what the future may look like. Why can't it be both?

It's easy to wave this aside as the current administration playing political games. But I don't think there is any reason to assume that the current era of open availability of models is going to continue indefinitely. Do you think that Chinese labs will continue to release open models forever, even when they get to the level that Mythos is at now, and beyond? And do you think that a competent US government would have no interest in regulating and restricting model access in 2 years' time, assuming that model capabilities continue to improve? I think we bias towards thinking the status quo is the norm and will continue, but this news invites us to question that assumption and think about different ways the future could go.

> Do you think that Chinese labs will continue to release open models forever

Yes.

I think the Chinese government either already has, or will soon, grasp that if they train the models that people use they dictate what people believe (at least around the margins where that's malleable), and they will happily throw resources at that.

And simultaneously that the only way they can actually get everyone to use their models is if it's possible for us to run them on our own hardware.

(This isn't exactly a utopian view of the future)

  • This is going to age very poorly when the best Chinese labs ALREADY just started not open sourcing their models.

    Qwen 3.7 is not open source; previous Qwen versions would have open source releases, but Qwen 3.7 plus does not. The second best Chinese model, Minimax M3, is testing the waters by taking longer and longer between “model release” and open sourcing it. This time, they spent 2 weeks after release before open sourcing it. There’s also a lot of rumors of GLM and Deepseek not open sourcing future models.

    It’s pretty obvious that you cannot take Chinese models as open source for granted, they’ll be closed source soon.

    • If we're measuring progress in hours and days then yes. But if we're measuring progress in months then OSS models are doing fine. You can get state-of-the-art performance in an open model if you pretend it is January 2026 instead of June.

      There is no evidence here that the cutting edge labs have any durable advantage. Extrapolating current trends it seems likely that even the Europeans will be capable of meeting any given performance measure with enough time. In fact the evidence suggests that the capital required to run the models is where a moat will develop. Knowing the weights won't help much.

    • The best Chinese models are DeepSeek (general purpose) and GLM (coding), and they are both open weight and share lots of their tooling.

      There are lots of AI companies and it doesn’t seem that they all have the same funding fountain or share monetization goals. I wouldn’t read much into what each one of them is doing.

    • The main reason the Chinese labs are releasing models as open weights is because they don't have the compute necessary to provide all of the inference. For the US frontier models something like 80-90% of the lifetime compute required for the model is inference rather than training. China wants to shepherd as much of their limited compute as possible towards training to keep up in the race.

  • The US administration restricting the use of US-trained models is one of the best gifts it could make to the Chinese LLM producers, and to the PRC government.

    • This entire administration is a gift to everybody but the US. It’s either in service of Russia, China or whoever is willing to pay Trump the most.

  • There's also the Meta motivation, that even if you don't get the control you would like from releasing a model, it may still be worth it to at least deny others that control. I'm sure that matters even more to China vs. the US than it mattered to Facebook vs. Google.

  • There is no moat in the model, and by making them open, it’s hard for one to be established when the free models are “good enough”.

    OpenAI and Anthropic are both hamstrung by this. Anthropic does have the better chance of surviving.

  • You don’t need the cutting edge to influence people’s opinion. “Export LLMs” to the rescue.

  • > I think the Chinese government either already has, or will soon, grasp that if they train the models that people use they dictate what people believe (at least around the margins where that's malleable), and they will happily throw resources at that.

    that doesn't require the model to be SOTA, it can be just a compact model capable of running on some inexpensive hardware. that is vastly different from SOTA models like Mythos which can potentially disrupt lots of things.

    • Of course it requires SOTA, people will always choose better models over some compact thing that is obviously more limited. You can't control the truth with models nobody wants to use.

  • > > Do you think that Chinese labs will continue to release open models forever

    > Yes.

    holy shit the naivete of HN nowadays.

> Why can't it be both?

Is the government going to fund all further development? Hard to imagine investors continuing to throw billions at products they aren't allowed to sell.

  • Why wouldn't they? They see this technology as a military asset now.

    • Honestly, with the caliber of people who currently comprise the US administration, leaving the whole thing to Openclaw and some new fancy model might not be the worst idea.

    • Trump and friends are only interested in investments they can personally make money from.

Yeah, there’s been a lot of debate about this on r/localllama — will there be a steady supply of new free/open models in the future?

And if not, can we simply keep augmenting “stale” models with new knowledge to keep them useful?

I’m on the pessimistic side of things on both questions.

As for the second question, obviously stale models can be augmented to an extent but it’s nowhere near a substitute for new knowledge being fully baked directly into its training.

> I am saying this probably is "silly behavior by a government" and it is a milestone that points towards what the future may look like. Why can't it be both?

Here is why it's unlikely this is anything other than "silly behavior by a government":

- some benchmarks show GPT-5.5, Gemini 3.1, and even Claude Opus outperforming Claude Fable, and yet it's Fable which is restricted.

- some benchmarks still show the likes of Kimi 2.5 outperforming any Claude model, and DeepSeek is getting equivalent scores (a few tenths of a percent difference)

> Do you think that Chinese labs will continue to release open models forever (...)

That's immaterial to the discussion. Even if China forced Chinese labs to restrict access to all models, the truth of the matter is that the Trump administration restricting access to US-based models does not prevent others from having access to models that are as capable or even better.

So what's exactly the point of this?

  • You’re completely overrating these benchmarks and it’s landing you at a nonsense opinion. Just actually use the models and you will see that the gap is significant.

    • It should be easy for a company like Anthropic to prove this beyond a doubt. Why don't they? Why don't they have a collection of prompts and side-by-side comparisons with other models showing how far ahead they are?

  • I got to try using Fable for a day... it was a clear and definite shift in quality and how independent it is.

    It was almost like having another human using and shepherding Opus for me, instead of herding Opus directly myself.

  • All that says is some benchmarks aren’t worth the tokens it takes to evaluate them. Mythos is clearly capable of finding zero days other models can’t, and Fable is close enough to be lumped with it.

    • > Mythos is clearly capable of finding zero days other models can’t

      I'm unconvinced that this is anything more than proof of work and marginal improvement that other models will catch up with, perhaps as early as next week. Lots of other current-gen models will find vulns that can be chained together if you're willing to burn enough tokens on the task, and Fable is an absolute token incinerator.