Comment by ls612

3 days ago

The main reason the Chinese labs are releasing models as open weights is that they don't have the compute necessary to provide all of the inference. For the US frontier models something like 80-90% of the lifetime compute required for the model is inference rather than training. China wants to shepherd as much of their limited compute as possible towards training to keep up in the race.

I think the main reason is to minimize the market for closed-source models from US companies.

China knows that doing what Anthropic/OpenAI/Google/... are doing is impossible for them. No one outside of China in any sane condition will send their data to compute farms IN CHINA like people currently do with US-based frontier models. Even if they could muster the inference power.

Hence they do the second-best thing possible to attack the dominance of the US-based corporations: reduce their moat by open-sourcing models that are not fully equal, but practically useful and good enough for easily 90% of typical tasks people use agents for in their daily lives. But way cheaper to run.

As long as this arms race in AI continues, China as "number two" will have some incentive to continue open-sourcing models. But of course the US government might force a change if they continue to enforce limited public access to new frontier models - there is no market to minimize if a model is not allowed to be publicly available.

  • I'm European and I don't see sending my data to China as more risky than sending it to the US. Rather the opposite.

    I think your vision of how the rest of the world sees the US is tinted by a massive bias.

    • As a private citizen, yes.

      But at work the calculus is entirely different. There is already lots of exposure to US companies (guess where our emails and tickets live), so the increase in espionage risk from adding another American company is small. Not zero, and trust towards AI companies is limited. But adding the first Chinese company to send data to would be a major risk. One nobody would sign off on, given the general reputation of the Chinese economy for widespread espionage, disregard for copyright, and producing copies of successful products using insider information.

    • Totally agree, though it is an unpopular opinion here.

      It’s the same paradox as people claiming “we are European, our data is safer in Europe”, when actually your privacy is better protected when your data is stored in China (or Russia): you are safer because it is out of reach of your local government.

      The only thing I dislike, and that’s no matter the service, is that my data or usage information is shared with third parties.

      For example, Anthropic conveniently forgets to mention Datadog has tons and tons of information about Claude users, or that your data transits through machines they don’t operate.

  • Was going to say this. Open-sourcing Chinese models will reinforce Chinese dominance instead of reducing it. When an open Chinese model becomes the best alternative to inaccessible closed US models, guess what everybody will start to use. And that same open model may embed certain narratives and values that please the Chinese government.

  • Ya. You know enough about China to know: would they be willing to sell users outside of China models that aren't fully pro-China (and won't deflect on tough questions)? Or would that be dirty money that they wouldn't want anyone to make?

    Like if they could release Ch-ythos 6 tomorrow BUT it had Western ideals, would they take the fame, clout, attention, & profit, or stick to the party line?

    (hope the monolithic brush is appropriate, considering, I mean it's an impressive system/country even if I have my own strong preferences - also I've taken as true reporting about their models deflecting etc. on sensitive topics)

    • Sounds perfect, sell it to me.

      I use LLMs for health, design and programming.

      If you want to make a political or religious pamphlet, you shouldn’t base it on a single LLM, no matter where it comes from.

With nearly everyone using inference accelerators, the pool of hardware is no longer shared between training and use.