Comment by jillesvangurp

20 hours ago

Most of the corporate world in the EU or North America will be hesitant to rely on Chinese AI providers. There are some very real blockers around things like data security, compliance, etc. And recent geopolitics don't help.

Legalities aside, you need to look not at the model quality but at the infrastructure needed to scale these models from tens (now) to hundreds (soon) of millions of users. Only a handful of companies actually have the resources and funding to do that. That's what these huge valuations are based on. These companies are gearing up to scale to these levels. That's why they are spending on data centers. Whoever has access to those data centers gets to tap into the revenue stream of people using models running on those.

The market for frontier models is roughly split between OpenAI, Anthropic, and Google. Then you have companies like X/SpaceX, Amazon, and Microsoft, which are more successful with their infrastructure than with their AI products, and companies like Apple and Meta, which have the money and the aspiration but so far are not having much success with their AI strategies.

Deepseek is just very poorly positioned to capture a lot of the enterprise revenue in the EU or North America. But they might become very dominant outside the US/EU. And of course China itself is going to be a huge market and equally unlikely to want to depend on US-owned AI companies.

Deepseek and all the other Chinese models have open weights. You can host them yourself, no need to send data to China or rely on them.

  • There is still a risk of supply-chain attack. People give LLMs direct access to their entire infrastructure via tools, and never check the code produced. It's not difficult to steer an LLM during training so that they'd output malware only when prompted a certain way, and that wouldn't come up during the initial evaluation.

    Personally I see no difference between China and America in terms of the risk of them embedding "backdoors", so to speak, but I disagree when people claim that open-weight models are obviously safe just because they can be run locally.

    • > It's not difficult to steer an LLM during training so that they'd output malware only when prompted a certain way

      Perhaps, but that's also a good way to lose users+reputation as there's no way to control when said malware is generated. Once the first instance is discovered cybersec researchers will have a field day reproducing it and showing the world.

  • It is not a trivial challenge setting up model serving infra for ~1T or larger models, especially in a high reliability environment (e.g. your team is using it for work, or you're using it to power production apps). Sure, there are third party providers, although the quality and reliability of their inference varies.

Run Deepseek on Deepinfra then? Or Fireworks if US-based is important. None of these are real issues outside maybe convincing your legal team to do a bit of homework.

  • I don't think you are appreciating the physical constraints here. Deepseek doesn't really have the hardware in the US or EU to do anything at scale.

    Sure, you can self-host a non-frontier OSS model, including Deepseek. And no doubt some people will pay one of the companies I mentioned to rent the infrastructure to do exactly that. Much of the rest of the world will be paying directly for direct access to the frontier models.

    As for the legal/compliance stuff, I recommend you don't take any big decisions on that front without consulting lawyers. My understanding of that is that most serious companies in the EU have to take these topics pretty seriously. I'm sure in the US, hosting all your data and secrets in Chinese data centers isn't a whole lot less controversial.

    The Chinese could of course choose to try to match the current levels of investment Google, OpenAI, Anthropic, etc. are putting into local infrastructure. But as far as I know they aren't, and there are probably a few political blockers for that.

    Without infrastructure, their role is being a niche player in these markets. It doesn't really matter how good they are if they can't scale to most of the market.

    • I mean, Western providers such as Fireworks AI/Microsoft Foundry (US) or Tensorix (EU) already are offering many of these models on their own hardware with all the typical compliance boxes ticked through a standard API. Either as open weight models or through partnerships with Chinese firms, or both. DeepSeek etc. do not have to do anything here other than making their models available to Western partners (either as open weights or through a licensing agreement).