I am nowhere near as concerned by this as I was a year ago, when I was expecting the axe to fall at any moment before the Chinese labs achieved some sort of escape velocity. I now think it's too late, all the cats are out of all the bags, there's no moat except maybe a temporal one of a few months, the genie is out of the bottle.
There is no secret sauce the US labs have that the Chinese ones don't, or won't have soon enough. DeepSeek 4 and Kimi 2.5 are not quite Claude 4.5/GPT 5.5 but there's no fundamental principle missing - they are strong evidence that there's no real advantage the "frontier" labs possess that isn't related to scale, which the Chinese labs will gain in time (if they even need to). The RL post-training techniques that work are widely known and easily copied. All DeepSeek is really lacking is data, which they're getting - and the harder Anthropic/the USG makes it to access Claude in China, the more of that precious data they'll get!
I used to sort of entertain the "fast take-off breakaway" scenario as being plausible but not really anymore. The only genuine moat the frontier labs have is their product take-up, which isn't nothing, far from it, but it's not some unbreakable technological wall. Too late guys - it might have been too late for quite some time.
I wish it was true. I would gladly use a GPT 5.2 high model equivalent for coding (6 months old) if it was offered cheaper by DeepSeek or Kimi. And I'm sure that's an extremely prevalent opinion among the millions of Claude and Codex users who are bothered by the costs.
However, they just don't perform that well in practice. That's the real issue. You can actually see it when you move away from open benchmarks. DeepSeek 3.2 is 4% on Arc-AGI 2 [1], while GPT 5.2 high is 52% and GPT 5.5 pro high is 84.6%. That's the real reason why nobody is using these models for serious work. It's incredibly frustrating.
In addition, I already feel the pain of the model restrictions myself. I'll ask my Codex 5.5 agent to crawl a website - BOOM, cybersecurity warning on my account. I'll ask it to fix SSH on my local network - another warning. I'm worried about the day my account gets randomly banned and I cannot create a new one. OpenAI already asks you to perform full identification in order to eliminate these warnings - probably exactly for that - so that if they ban you, it's permanent.
I worked extensively on ARC AGI before and one thing is SURE as hell. OpenAI and Gemini in particular use this as marketing material. You can correlate the benchmark release with stock price increase. They feed synthetic datasets of ARC into their models to boost the numbers. There is no doubt in my mind Gemini is no better than DeepSeek other than being specifically fine tuned for ARC AGI. Heck, they even say so and they say they have paid annotations for ARC. Again, economic incentives.
In terms of whether these models are actually better at the benchmarks, likely not. See ARC 3, where the gap is diminishingly small.
Why are you bringing up an outdated Chinese model from 6 months ago to compare to a US model from 6 months ago? The outdated Chinese model will have performance from ~12 months ago, obviously. But today's Chinese model DeepSeek 4 has performance not far from the US model 6 months ago; 46% compared to 52% from 5.2.
I 100% agree with you, but I've been convinced over the last year that it's a time and scale issue, not anything fundamental.
The Chinese models right now are in a weird spot. Compared to the frontiers, both their pre and post training is woeful - tiny, resource constrained in every dimension including human, slow. I'd compare it to OpenAI 5 years ago except I think even then OpenAI had way more!
But they "cheat" quite a lot in distillation and very benchmark-focussed RL and that's where you get this superficial quality in the leaderboards that doesn't match up when you go off-script. Arc is a great example in that it really belies an "inferior soul" at the heart of it all.
What gives me great hope though is that those same scaling laws that Altman and others have been hyping forever will absolutely kick in for the Chinese labs just as they did for the US ones, and I don't think anything can stop that process now. So they will catch up. It won't be tomorrow, but it's not going to be 10 years either. 3-5 would be my reasonably educated guess.
And the final risk, that China itself might try to restrict availability of the tsunami of GPU or other AI hardware it will inevitably produce - well, I just can't really imagine a country that has been configuring itself for the last 40 years as a single purpose export machine deciding that actually, no, it doesn't want to export something.
About the model restrictions - absolutely. I've been trying to do security research on my own software and the frontier models immediately get suspicious. I've been playing with the local ones much more this year basically because of this. They have deficiencies, for sure - they feel very "hollow" compared to the major labs. But I've talked to a lot of people, and the consensus is pretty clear - just a matter of time.
Have you tried the latest DeepSeek v4 Pro inside of the Claude Code harness? It's not listed in that site.
It definitely 'feels like' it is as good as Claude for many regular web app coding tasks (though I don't have real benchmarks). And it is comically cheap.
I'm not suggesting it is better than the latest Claude or codex models, but it seems 'good enough' for a lot of use cases in my limited real world testing.
They're not even that much cheaper (1/2 price per task according to Artificial Analysis) once you account for lower token usage of GPT-5.5. I can't justify it when factoring in the extra time wasted, and the cheap codex usage I get through the monthly plan. Frontier intelligence is not a commodity product ... yet.
Arc has no predictive power whatsoever. I always use the best models available. So far I haven't found a task that Chinese models cannot solve very quickly and reasonably. Do you have any examples where they failed for you?
If you want something close to Claude, use GLM 5.1 with Claude Code. Their subscription price is no longer 10 times cheaper though (at best 2 times cheaper).
And yet Claude six months ago was amazing and good enough for you.
This shows that AI cloud consumption is just a conspicuous consumption status symbol, nobody knows why they need cloud AI or what problem they are even solving.
Which is why, I believe, the big AI companies are starting to focus and roll out vertical products more. They know that the models themselves aren't sticky, people can easily switch between different models with not much hassle.
I think the big AI companies are trying to transform into the next Microsoft. Completely capture both enterprise and consumers.
> I think the big AI companies are trying to transform into the next Microsoft. Completely capture both enterprise and consumers.
That is going to be a failing strategy though. Whatever OpenAI or Anthropic implement, Microsoft and Google can trivially copy and provide to their existing customers that are already deeply invested in their platforms.
> There is no secret sauce the US labs have that the Chinese ones don't, or won't have soon enough.
Over the last year it seems that the only thing US labs are ahead in is money spent. At least half of the technical innovations, if not more, came from Chinese labs and were published openly.
Broad and deep capital markets are a real competitive moat for the USA. No other country or economic bloc can quickly deploy huge amounts of capital to new opportunities nearly as fast. China can work around that to an extent with a command economy that focuses resources on national strategic priorities but it's slower and less effective over the long term.
All of the reasons in the article also apply to Chinese companies. If a Chinese model becomes good enough to make it significantly easier to hack Chinese government servers, do you think they'll allow random people unfettered access to it?
The economic pressures are the same, too. Currently, Chinese models are offered for cheap or in some cases provide weights for free because that's the only way to gain traction. (That closed-weight releases by Baidu, Bytedance, iFlyTek etc. hardly generate any buzz bears that out, as does the fact that when Alibaba does a closed-weight release, someone always gets confused because they associate the Qwen brand with open models.) At some point, their investors are going to want profits, not just user counts. That means higher prices, or no more new models.
If there's no secret sauce and all you need is scale, that would actually be kind of the worst-case scenario for catching up to the frontier, since scaling is expensive and the frontier model companies have easier access to capital as well as higher revenues.
> If a Chinese model becomes good enough to make it significantly easier to hack Chinese government servers, do you think they'll allow random people unfettered access to it?
They aren't trying to become that good, nor do they need to in order to have real positive impact. Models like Mythos are estimated to be humongous even on a datacenter-wide scale, which is actually a big factor in its limited availability at present. It's mostly helpful as a one-of-a-kind proof of concept, to answer the question of whether AI can still plausibly scale by growing capabilities and what happens to alignment concerns when you do that.
Harness engineering is a moat. There’s user loyalty and reliance on the chassis that Claude is on, for example, just like there’s more market share by MacOS+WindowsOS over Linux Open Source.
The tooling industry has been moving very much in the direction of "plug in the AI of your choosing" for a while now, and given how much Anthropic fights third-party tools, they are definitely afraid of being left in the dust.
> just like there’s more market share by MacOS+WindowsOS over Linux Open Source.
It's hard to change OS. It's not hard to jump from one AI tool to another
It's absolutely NOT a moat. Making a harness is the EASY part.
If you had said "marketing is a moat" then yes, I would say you were right. But creating a harness equal to or better than Claude Code is trivial. The CC harness is actually shit. There are tons of open-source harnesses that work better than CC while using Opus via OpenRouter.
But 1) people use other models with that same harness. 2) I moved on from Claude Code and had all the features I cared for up and running in less than a couple of days. Without even looking for available plugins or extensions.
I mean, if that’s the case, then Anthropic themselves are currently actively filling in that moat with nice, solid, walkable dirt. Claude Code may have been a moat 6 months ago but these days you’ll want to replace the “m” with a “bl”.
I agree the genie is out of the bottle technologically. I'm less convinced that means access stops being politically and economically important. The bottle may be gone but the best lamps are still expensive
But a “good enough” lamp just got a lot cheaper. The cost of tokens on DeepSeek V4 Pro is so low I don’t even think about it, and I'm currently trying to figure out useful things to do with as many simultaneously running agents as I can. What would have cost $150 less than a year ago now costs 35¢.
Likewise Qwen 3.6 absolutely blows me away and that’s on a 35b 6-bit model on a local 5090. Same thing, busy trying to find stuff to do to keep it busy 24/7.
I can still find some niches for Opus 4.7 but being able to attack problems and not worry about consumption is a game changer.
I would agree, the only thing Kimi is really missing is stability and harness training. For general chat tasks I consider it mostly on par. Occasionally I'll give the same problem to Kimi, Claude, GPT and Gemini, and it's not unusual to see Kimi correctly figure out some kind of weird extra thing that the others missed, like some kind of mentally unstable savant.
> There is no secret sauce the US labs have that the Chinese ones don't, or won't have soon enough
This is not just about mainland China though. The current US government is extremely selfish and self-centered. Other countries really need to consider their own long-term situation here.
I’m flocking from GPT to opus every week for the past 3 months and always come back.
The point isn’t that gpt is better, it’s that it is so much better for my work it isn’t even sticky, it’s reinforced concrete. I use opus 1% of the time because it writes better and it’s sticky there.
Yes I’ll switch approximately immediately if opus or Gemini (which I use more than opus!) is better for what I do, but at this point frontier model tokens are not fungible.
The large AI houses arguably ensure that model switching be a natural action for their clients, by switching the default model of their flagship offerings every few months. Such is the price of progress.
Today's tech echoes 1960-1970 mainframe era: very centralized around a handful of companies controlling "massive cloud compute" in bespoke mainframe-like topology.
All of that will be legacy in a couple of years. Today's B200 clusters are tomorrow's e-waste. Decentralization might happen gradually or abruptly. But to me it's obvious that we'll be thinking of high-tech tensor processors and GPUs the way we thought of individual transistors and tube amplifiers in the 1980s.
If AI turns out to be the revolution it purports to be, then the underlying hardware will change much more rapidly than it did with ICs and microprocessors in the late 1970s. Today's hot is tomorrow's junk.
It's basically converted sand. Most of that conversion happens in Taiwan at the moment, which China considers one of its provinces and the USA treats as a protectorate. Hence the interest in that region....
No mention of open weights anywhere in the piece, which is weird. Qwen, Llama, DeepSeek are months behind frontier, not years. If you're a European startup worried about getting cut off from Anthropic's API in 2027, the real question is what the open-weight frontier looks like then. Probably pretty capable. That undercuts most of the doom scenario.
Also, he concedes Mythos-level capabilities will be cheap next year, then handwaves it with "you need the best AI, not good-enough AI." For most use cases, frontier minus six months is fine.
Open weights undercut the absolute cutoff scenario. They don't fully solve the question of who gets the best model first, who gets enough tokens to use it heavily, and who gets to integrate it into sensitive workflows without waiting for permission
Affordability of hardware that can run local LLMs is a real factor, too. Not sure when RAM prices are going down, but with everything that’s happening and can happen in the world right now, it doesn’t look like it’ll drop in the near or medium-term
No one is going to run models that are comparable to frontier locally without spending enormous sums for use at scale or in large orgs. Even with cheap RAM, you will still need a very large budget for frontier-level capability.
Open models that are competitive with frontier will be used on shared hosts.
Open weight models do not mean you can run them on your laptop (except for the small ones). It means that someone independent (a cloud provider, another company ...) can build big computers that are capable of running those models and provide you with metered usage.
At the end of the day, as a consumer, you still pay per token (or per something) to your provider, except you can choose from multiple providers with your own criteria. If you want to use DeepSeek v4 hosted in Europe, it's possible.
> Open weights will remain open only if they’re significantly worse than the frontier weights.
This makes the assumption that you earn more money by selling access to the model than by releasing the weights. That might be true for a company, but a US adversary might profit more from tanking the US economy. NVIDIA's stock dropped by 17% in a single day after DeepSeek-R1 was released, and the share of tech companies in the S&P 500 has only risen since then.
1. Your European startup will be competing with others using a much better frontier model. In a scenario where you already have other major disadvantages (access to capital, labor), you might be outcompeted
2. Open models have been keeping pace very nicely, but they rely on distillation of frontier models. If the race gets really tight, this could be affected so that the time gap grows larger (ie, it's very unlikely anyone but Anthropic is distilling from Mythos at the moment)
> 1. Your European startup will be competing with others using a much better frontier model.
If the small (and I'd even say, sometimes imperceptible) difference between Opus & DeepSeek v4 Pro is such a disadvantage for your startup, then it's your startup that has an issue, not the LLM.
At the end of the day, your startup is there to solve real problems, and even before LLMs, being fast at coding things has never been such a huge competitive advantage compared to marketing, sales, customer support, product vision ...
Llama is not months behind GPT 5.5 Pro. I don't think Qwen or DeepSeek are either.
edit: I'm specifically referring to the "5.5 Pro" model, not regular 5.5 with Pro tier subscription. Claude has no model available that's comparable to 5.5 Pro either.
I’ve used DeepSeek 4 Pro through Claude. It’s fine. Plans are similar to what sonnet/opus make. Same massage-the-plan -> massage-the-code loop. Maybe the code is a bit worse, but that’s the “months behind” thing.
The thing is, vast majority of code tasks aren’t a venture into the unknown. We as an industry for the most part build CRUD interfaces and dashboards. That can be achieved, with supervision, with frontier open-weights models quite well.
Someone recently made a graph showing that the gap between US American frontier LLMs and Chinese open weight LLMs (including DeepSeek v4) is widening. Unfortunately I can't find it anymore.
Open models are pretty good at this point but the problem is that they are limited by the tooling and infrastructure that surrounds them. For example, the last time I tried to set up web search with an open model, the experience was pretty bad.
In our company of 24 employees, we get by with two DGX Sparks. We don't use AI heavily, but each Spark can serve about 6-8 concurrent requests with a full context length of 256k, which is decent. We get about 35 t/s depending on the model we use (currently Qwen3.5 122B A10B and Qwen3 Coder Next), but we might set up a smaller model too for simpler tasks.
This works for us and will work for years to come. It is not SOTA, but it works darn well for our purposes, and we control the compute and data flowing through it, so totally worth it.
That's pretty nice actually, how much KV cache does that model require at full context? That tends to be the main limit to running concurrent requests locally, there's KV quantization but it has outsized negative impact on model quality.
I have experimented with both q8 and q4 for KV cache. I can't find any difference between q8 and fp16, but q4 suffers more when the context grows. q8 seems like a good compromise and gives us enough ctx for about 6-8 concurrent, full context sessions. But we have not fully tested those limits yet, as the context windows rarely reach the limit.
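For anyone wondering why KV cache ends up being the limit on concurrent sessions, here is a back-of-envelope sketch. The layer count, KV head count and head dimension are placeholders made up for illustration, not the real config of any Qwen model. It also assumes plain grouped-query attention in every layer and treats q8 as 1 byte and q4 as 0.5 byte per element, ignoring the small overhead of quantization scales.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem):
    """Approximate KV cache size for one session (keys + values)."""
    # Factor 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical model shape (NOT taken from any real model card).
N_LAYERS, N_KV_HEADS, HEAD_DIM = 48, 4, 128
CONTEXT = 256 * 1024  # a full 256k-token context

for name, width in [("fp16", 2.0), ("q8", 1.0), ("q4", 0.5)]:
    size = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT, width)
    print(f"{name}: {size / 2**30:.1f} GiB per full-context session")
```

With those made-up numbers, halving the element width halves the memory per session, which is the whole appeal of q8 over fp16. Models that mix in sliding-window or linear-attention layers need less than this formula suggests, so treat it as an upper bound for that kind of architecture.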
This is pretty cool. How would you say that these open models compare to SOTA on coding tasks? I pay $200/mo for Claude Max but honestly this sounds way more fun.
Nowadays I use our local setup 95% of the time, but it is not that long since that flipped for me personally.
Context: I have a $20 Claude Code subscription, and have used it for a handful of small-ish projects over the last year, in parallel with local models on my AMD 9700XTX (24GB) at home. Mostly Ministral 14B and more recently Qwen3.6 27B Dense 4q.
Historically, the tooling (inference engines and harness) has been the biggest challenge when using local models; a lot of the benefit of Claude Code was a rather unified and well-oiled agent system. Local setups often bring with them subtle incompatibilities between models, inference engines and agent systems that are not obvious from initial testing, but cause trouble on projects larger than a couple of files.
The Spark setup at work is now at a point where I do not miss Claude, like at all. A big part of this is the harness and the tools available to the agent, most critically a good tool for searching online. I use my Kagi subscription to allow the models to fetch up-to-date information, and the Kagi MCP I use also has a summarizer which is very helpful in avoiding rapidly filling up the context window.
I mostly use Zed and its native agent, which only recently got muuuch better, and on the terminal I use Pi with a minimal selection of extensions (currently pi-kagi-search, pi-smart-fetch, pi-btw and pi-diffloop). I also have Pi in Zed via the ACP, but it does not work so well with some of the extensions; the lack of a built-in permission system is especially a problem when YOLO-mode is the only mode :)
Honestly, as long as you have a model that is decent at tool calling, you're good. Having a solid and stable frame around your model makes a huge difference. The only caveat in all of this is that I spend most of my time on smaller projects and debugging on Linux-based systems, not huge and complex code bases, so your mileage might vary.
The next phase at work is to set up a ChatGPT-like web interface, and so far LibreChat is at the top of my shortlist. We had OpenWebUI for a while, but it is so bad at using MCP tools that it is practically non-functional for us. LibreChat is a bit more work to set up, but the interface and its MCP story are much more solid. The goal is to plug our internal helpdesk, docs and task manager system into LibreChat via MCPs, to give us a quick way to query and gather information that is currently very time-consuming to collect on your own.
The distillation risk has been brewing for a while now. In a very real sense, the model is the data, so if the data is locked down because of how valuable it is, it was only a matter of time before fully open access to the models would be revoked.
There's also an additional economic concern that rarely gets mentioned: because no one has cracked continual learning, keeping models up-to-date and filling in gaps in performance requires retraining on an ever growing dataset. Granted, you aren't starting from scratch each time, but the scaling required just to stay relevant looks daunting.
I don't know where any of this goes on a societal level, but I've believed since the release of DeepSeek R1 that access to frontier models would eventually be locked up behind contracts, since the only moats protecting the models themselves are purely artificial. It remains to be seen how effective China is at pushing the envelope, and whether they are interested in providing unfettered access. And on top of that, it remains to be seen how well these models actually turn out to scale in the long run.
They are also not getting the same quantity or quality of data as was possible in the first years of "ingest". Compared to the beginning, from here on it is more like a drip feed of new training data. Still immense volumes of data, but we are talking 1 year of data production from society versus centuries of text and data ingested in a short time frame.
For pre-training, yes. But for post-training you need high-quality labelled datasets for reinforcement learning. So far AI has been most successful in coding, because you can translate the usage into such datasets, and thus produce a virtuous cycle: More usage produces more data, which produces better models, which drives more usage.
The question is whether this same model can successfully be applied in disciplines like medicine, law, engineering, etc.
The more fundamental bottleneck is not even the frontier models, it's the datacenters. Let's say Europe breaks apart from the US completely tomorrow. It does not have enough datacenters (or GPUs in general) to sustain its inference needs even if it would resort to Chinese open models. And to build new datacenters, it would need to source parts from the US and China.
In other words, if AI does have continued significant economic impact, only the US and China would be able to leverage it completely. The rest of the world is implicitly betting that AI won't be good enough, or that eventually the compute curve flattens out so using a model that is 10x larger only leads to marginal benefits.
Speculative decoding is not that useful at scale, it's mostly about making local single-user inference faster. When you're batching multiple inferences together, that's already as fast as the verification you have to perform w/ speculative decoding.
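To make the verification cost concrete, here is a toy sketch of one round of greedy speculative decoding. Everything in it is an illustrative assumption: the two "models" are plain Python functions over integer tokens, the names are made up, and real engines use probabilistic acceptance and check all draft positions in one batched forward pass instead of a loop.

```python
def speculative_step(target_next, draft_next, tokens, k):
    """One round of greedy speculative decoding; returns the extended token list."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposal = list(tokens)
    for _ in range(k):
        proposal.append(draft_next(proposal))

    # 2. The target model checks each proposed position. A real engine does
    #    this in a single batched forward pass; the loop is only for clarity.
    out = list(tokens)
    for i in range(len(tokens), len(proposal)):
        expected = target_next(out)
        out.append(expected)  # we always keep the target's own token
        if expected != proposal[i]:
            return out  # first mismatch: the rest of the draft is discarded

    # 3. Every draft token matched, so the same pass yields one bonus token.
    out.append(target_next(out))
    return out


# Toy models: the target counts upward mod 10; the draft agrees except after a 4.
def toy_target(tokens):
    return (tokens[-1] + 1) % 10


def toy_draft(tokens):
    return 0 if tokens[-1] == 4 else (tokens[-1] + 1) % 10


print(speculative_step(toy_target, toy_draft, [1], k=4))  # [1, 2, 3, 4, 5]
```

The output is always identical to what the target alone would have produced; the only win is that several tokens come out of one target pass. That pass still has to process k+1 positions per request, which is the same kind of work a batching server is already filling the GPU with, hence the point above about it mattering mostly for single-user inference.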
There is "local AI" which is running on consumer grade hardware and "local AI" which still needs a datacenter (DeepSeek 4, GLM 4.7, etc). If you woke up tomorrow and could only use the latter you are about 6 months behind the frontier, if you have to rely on the former you are 2 or 3 years behind.
All these tricks like quantization and speculative decoding can also be used by the leading AI labs, which means they will simply have more compute than you at the end of the day. So far this has translated into better performance.
In theory yes. They've got a bargaining chip with TSMC. But it's unclear how much use that would be without a safe shipping route between Europe and Taiwan and/or a navy capable of maintaining such.
Considering the economic angle, one possible long term future is that access to frontier models is only realistic for the wealthiest 1%
They will use this access to the ultra intelligent models to increase their wealth further. Inequality will continue to be negatively impacted
The thing is, the open source models are smart enough to do most work if the harness and orchestration are right. So even if the next-gen models get locked behind monopoly paywalls, build real things in the real world and fight for a humane world.
The availability of open models with such capabilities is based on the goodwill of the Chinese. And that might end eventually, especially since the matter comes down to one decision by Xi and the party.
True but I’d argue they are good enough now to do 10 years of continued workflow automation. It’s like the internal combustion engine or personal computer, at some point they were good enough for broad categories of work. I think that’s where the current models are.
Over on the image generation side, "frontier AI" seems to be coming along rather well. Watch this video, which was released eight days ago.[1] Can you find any flaws? Two years ago, just getting hands with the right number of fingers was tough. Last year, there were jarring errors in every scene. Now, very little is wrong. How much longer will anyone need Hollywood studios?
It is a LOT better than 2 years ago, but there are flaws and it's unpleasant to watch. The easiest to spot is their shoes (which they weren't wearing 1 second ago) flying off their feet without being kicked off in the first 10 seconds.
But if progress keeps going I'm sure it will get to the point where my brain doesn't feel sick after watching it. I hope so, because I'm sure there's a lot of AI videos in my future, whether I want them or not.
Yes, such systems are still struggling with continuity.
(There might be a workflow solution to that. Part of the system needs to do the job of what old films list as the "continuity girl". For each shot, there's a blocking diagram of who stands where at the beginning of the shot. There's a description of what each character is wearing, holding, or touching. If something generated that for the end of each shot, and it was fed into the prompt for the beginning of the next shot, that would help maintain continuity. This is another example of where a concrete mid-level abstraction is needed to keep things on track.)
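A minimal sketch of what that continuity hand-off could look like, assuming the end-of-shot state has already been extracted somehow (that extraction is the hard part and is not shown). All class, field and function names here are hypothetical, not from any real video tool.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterState:
    position: str  # where the character stands at the end of the shot (blocking)
    wearing: list[str] = field(default_factory=list)
    holding: list[str] = field(default_factory=list)


def continuity_prefix(end_state: dict[str, CharacterState]) -> str:
    """Turn the end-of-shot state into text to prepend to the next shot's prompt."""
    lines = []
    for name, st in sorted(end_state.items()):
        wearing = ", ".join(st.wearing) or "nothing noted"
        holding = ", ".join(st.holding) or "nothing"
        lines.append(f"{name}: at {st.position}; wearing {wearing}; holding {holding}.")
    return "Continuity notes:\n" + "\n".join(lines)


def next_shot_prompt(end_state: dict[str, CharacterState], shot_description: str) -> str:
    """Combine the carried-over state with the description of the new shot."""
    return continuity_prefix(end_state) + "\n\nShot: " + shot_description


state = {"Dancer A": CharacterState("left of frame, ankle-deep in surf", ["white dress"], [])}
print(next_shot_prompt(state, "Close-up as she turns toward the camera."))
```

The idea is just that the notes are explicit, structured data that survives between shots, so a barefoot character can't silently acquire shoes because the next prompt never mentioned feet.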
Anyone have any idea what tool generated this? It's way past Stable Diffusion.
Still in the uncanny valley for me. Like watching AI the film. That said, it’s 3 minutes long and maintains the setting across many different angles, zoom levels, etc. Pretty impressive.
And even if there weren’t any jarring errors, and rest assured there’s about a billion of them, there’s no appeal to this. It’s all context free short unassociated clips of pretty faces dancing on a beach. And?
There’s no narrative, there’s no sense of reality, it’s just a sense of here’s a million pixels of colours that have proven to go well with each other, it’s _slop_.
It’s been years and the only place AI has conquered in visual entertainment is as a subpar Photoshop replacement to fill in the B-roll gaps for those that don’t have the patience or money to do it the proper way.
>How much longer will anyone need Hollywood studios?
As long as they need shots longer than a few seconds, I suppose
>Now, very little is wrong.
You think culture is a matter of whether you've genned bass drums in front of or behind the drummer? (or to be a little more critical, whether you've remembered to add "sprinkle in some racial diversity" after the first prompt). You got the Coca-Cola trademark just popping up throughout, though, good job there
The video doesn't even have to load to know it's AI generated. The channel profile thumbnail and the video description are dead giveaways. The first frame of the video has too many errors to be worth repeating here. The first 0.5 seconds of the video has implausible movement.
I hear you. It is impressive technically but as far as finding flaws, I will just say this. This looks like something aliens would create in a dystopian simulation based on a very odd understanding of old movies. I found it quite unsettling. Do you really think this would replace films with real actors and real writers (assuming they left the millennial talk and "modern audience" stuff)? I think the memes and parodies of AI video are more interesting than this kind of thing.
Every continuous shot lasts no more than five to ten seconds. It's not a "give-away" as such, but it's certainly a tell. r/aivideo is chock-full of this crap.
The thing that’s always missing from videos like this is how much prompting or manual editing it took. It’s always implied that it was a one-shot, when it almost certainly was not.
It is way better than some years ago, but nearly every scene has something strange in it. Look closely. Look at them throwing shoes at 55s if you want something really obvious.
I think we'll know this is true when deepseek becomes illegal in the US.
I tried to sign up for deepseek API access directly from the company out of gratitude for the open source contribution (deepseek.com) but payments are blocked by US government rules.
For now open weight models are legal. Conceivably the government could claim that an open weight model "contained" forbidden information and simply ban it. We live in authoritarian times.
When my dad got his Computer science degree at Queens University (Ontario) in the late 1970s, there was an entire faculty-sized building on campus with no windows and few doors. This wasn’t just the CompSci department, the building WAS the computer.
This. It's extremely frustrating, AI can be a 24/7 tutor, but too many students use it to do their work instead.
We have to really rethink how and what we teach, and how we evaluate. Scoring (non-handwritten) homework is pointless, even counterproductive (because it incentivizes cheating, even for the students who don't want to, just to not be outscored by the cheaters). Hand-written homework means the students at least have to have read the work once...
And soon, with AI glasses, even in-classroom tests will be difficult.
Universities provide the AI to students. We have CoPilot in our 365 and the Ed majors get free AI crap all the time. Hard to make drugs illegal when you're the dealer
The uncomfortable implication is that "AI sovereignty" may end up being less about training your own GPT-class model and more about securing compute, energy, datacenter security and contractual access
Au contraire, I think we’ll soon see $10,000 self contained boxes (512 GB Mac Studio, basically) that you plug in to a LAN and have infinite inference.
People pay for cars and trucks (same cost scale) globally as they’re profit-generators for every user (as far as giving them access to jobs).
Yeah, so it's just business as usual: If you have ungodly amounts of money, you can essentially do anything, and if you don't, you can't. It's always been this way, and it'll always be this way. I don't see this as a world-ending issue.
> “The two AI superpowers are going to start talking. We’re going to set up a protocol in terms of how do we go forward with best practices for AI to make sure nonstate actors don’t get a hold of these models,” Bessent told Joe Kernen on Thursday, on the sidelines of President Donald Trump’s two-day meeting in Beijing with Chinese President Xi Jinping.
I wonder if the countries that don't have "AI Sovereignty" end up being like what Japan is now, technologically. It's stuck in 90's/early 2000's tech and norms (i.e. left behind) but its infrastructure and society chugs along (the demographic problem is a separate issue).
Would that make those countries more attractive to young people perhaps? As a place to grow and learn skills where the opportunities are non-existent in the AI Sovereign countries.
We should be aiming for less token usage, ideally none at all. Current AI is using LLMs to expand horizontally, but with the goal of achieving vertical progress: inventing truly new stuff and being able to eliminate our biggest problems. Problems like cancer need only be solved once, and no more tokens are needed after that.
Think this somewhat underestimates economic pressures the US labs are under.
OpenAI etc need to make crazy revenue to get their investment math to work. Perhaps you can sell some tokens to privileged partners at a premium rate but I think they’ll need global scale ultimately
I would imagine not every single one of us on HN has enough disposable income to subscribe to Claude Max, or a similar max plan for other models, without thinking.
Some people mentioned open weight models, but there are two hurdles. One, the current economic situation means securing the best hardware is already stupidly expensive compared to a year or two ago. Two, open weight models lack the magic that Claude/Gemini/OpenAI put in the proprietary ones, meaning one will have to create their own agent that is clever enough to search the internet when it knows its training data is stale.
> I would imagine not every single one of us on HN has enough disposable income to subscribe to Claude Max, or a similar max plan for other models, without thinking.
You don't need the Max plan with other models if you're not going completely crazy. Other providers have much more generous limits than Anthropic.
It's worse than that - all AI features will get broken down into even finer slices and you will have to pay for everything based on the finest level of slice they can make and still make money.
Physics and economics will drive cost. Current token pricing is based on unsustainable investment and energy cost. However, this is more of an optimization problem than an inherent show stopper. Token cost will inevitably come down over time. But this could take a while before it catches up with demand. Manufacturing will step up to provide cheaper GPUs. Etc. There will be some consolidation but the whole thing will converge on something that should make long term economical sense.
Ultimately it's a resource control issue. To power AI you need land/space (to build on), water, energy, and lots of hardware. Hardware needs to be manufactured and engineered. It needs metals, some exotic materials, machines, etc. More resources in other words. If you look at China vs US here, they are really well positioned in terms of resources and supply chains. The US has fallen behind quite a bit on energy and all the critical resources needed to produce hardware. AI is bottle necked on a lot of stuff that China has or makes in abundance.
For the frontier models, there are a growing number of companies and countries that provide them. We're used to mostly talking about the US ones. But of course the Chinese have a lot of capability here and they are not that far behind. And that's judging by the models they choose to release under OSS licenses. Those models are not their frontier models. And there are a lot of other countries developing and using models that aren't necessarily talking openly about what they are doing.
The irony with these frontier models is that they only generate revenue if people can use them. Why sink billions in AI infrastructure and models without a revenue model?
The reality with Mythos is that you have to assume that the Chinese (and others) are not that far behind and may already be running an equivalent model that they just haven't told anyone about yet. Anthropic gate keeping Mythos and its findings is probably wise. But it's not long term sustainable to depend on that happening or working very well. Or even on them even being a leader in this space.
This is becoming an arms race between countries, and economies. And it's an economical and resource control race. Developing and researching in the open has advanced things massively. But it has also empowered the rest of the world. Both Anthropic and OpenAI are staffed with people from all over the world. You have to assume that they probably aren't very good at keeping things secret.
Those billions in AI datacenter infrastructure will eventually be repurposed to run smart models like Mythos, not ChatGPT or even Opus/Sonnet. That future "revenue model" is quite robust to any foreseeable competition from on-prem FLOSS inference. It's a natural fit to the actual capabilities of large datacenter-scale compute.
DeepSeek is not a distillation of Claude or ChatGPT - stating this is just idiotic politics at this point.
The Chinese labs have reached "escape velocity" long ago - they will continue development regardless of API access to US models or the willingness of US labs to share their research.
I'm not so sure about "soon" - the big labs are profiting from the discovery and experimentation efforts by independent contributors (openclaw, etc) and reducing their capabilities also reduces input from this side.
How much money are you all paying to use this tech? Last I even tried, it would cost my entire salary. Yet, everyone and their newborns are using it every day for everything. How is this possible?
Depends on which exact model we're talking about, and on your salary.
For example, with the $40/month Kimi Code subscription the limits are so generous that you can use it every day all the time for everything (basically just have an agent constantly running doing something) and never run out of tokens/hit the limits.
It took almost two centuries after the Industrial Revolution for broad middle-class living standards to become common for a large part of the population - and that happened after an intense fight for rights and a fair share of economic gains.
So, sure, AGI might raise equality - but that's only if we fight for it.
You need to understand that these models are provided by the corporate entities, they are expensive to maintain, iterate and run. There is still no strong correlation between the use of AI and the business outcomes so there should be a real ceiling to how much enterprises would pay for tokens. The gov is a usual choice to establish contracts and get some stability, similar to building nuclear reactors or military equipment. And posturing about limiting model access is just saying it is expensive to subsidise its use for cat image generation or call summaries.
I am pretty sure we have not found the killer app (like an IDE even) for us to extract all the possible value from the models yet. I would even go as far as to say that the synthesis between a human and AI could leverage average models to achieve a lot more compared to the model/agent working on its own.
edit: Just to add to this, I am going through Mythos scans and it is not perfect, very much similar to what pentesters would do with the added bloat of noise in reports about nonissues.
I hope regular people will stop using "national security" and "national interests" as euphemisms and framing, and will call these things a psychopathic fight for power.
Assuming that some humans are worse than others because of their flag picture and that they deserve less access to resources is barbarism. There is no security in limiting access to NSA-style entities; it's an absolute insecurity for everyone but them throughout the whole world. How is that in anyone's "interests"?
We see every day now how suspicious bugs that look exactly like backdoors (e.g., Microsoft BitLocker) get exposed. That's in humanity's interests (and those of particular nations as a subset) — not being subjugated by small rings of professional outlaws. We need these instruments to defend people, everywhere. We don't need to give leverage to any state psycho. Let's make every one of them weaker.
If Amodei and the co. were in charge the models would alert the police if someone said "boob" and the goys would only get GPT 2 level models, hell, even that might be too dangerous.
I suspect this was just a throwaway word usage, but its usage here ends up being pretty anti-Semitic, so probably worth reconsidering its use if that wasn’t the intention of your post.
What part of what he said was false? Dario Amodei and especially Sam Altman have been treating the general public like cattle. And goy simply means non-Jew, how can not talking about Jews be anti-Semitic?!
> And it doesn’t stop with the security questions: the Trump administration’s signature style of international engagement is to wield American leverage as a bundle. Deadlocks in trade negotiations are broken by threatening to withhold intelligence, tech deals are stalled by reference to food safety standards. And so I don’t know when a U.S. administration would choose to leverage its seemingly inevitable predeployment authority over frontier models to secure its broader interests, but I’m sure it would in due time. That means that even if we do everything ‘right’ on the security and economic side, frontier access is still fundamentally contingent as long as there’ll be divergences between governments’ strategic interests.
The Trump Administration telling the very neo-fascist oligarchs who bought him an election and bought him a ballroom to play nice with their toys? At the expense of rampant capitalism? Lol.
He already showed us the limit of his comprehension of the topic when he made EO 14179 limiting states from regulating AI.
Trump doesn't swing for perfect pitches. He is a madman, a lunatic, and a true moron. Do not give this man any credit. I would be shocked if he could tell you the time on an analog clock.
That's a weird way to characterize months of incessant "we have incontrovertible hard evidence but you can't see it yet" claims, which--when finally forced into the light--were laughed out of every court in the nation.
If it was just pure and innocent "questioning", things would be very different. We probably wouldn't have had the January 6th mob attack on Congress, for example.
As someone who actively monitors the Chinese internet as well, I believe we are heading toward a world split into two distinct AI spheres.
Coming from South Korea—a nation outside the US-China dichotomy—the fundamental issue I see is the closed nature of the American AI ecosystem. Products like Gemini, GPT, and Claude are API and subscription-based, meaning their pricing and access terms can change at any moment. If that volatility increases, developers desperate to escape vendor lock-in will inevitably turn to local models.
Chinese open-source models like Qwen and DeepSeek are already exerting massive influence over our domestic AI ecosystem. While the US still revolves around CUDA, China has built its own CANN ecosystem. Most impressively, Chinese local models are incredibly accessible, even for a foreigner like me.
I believe that while the US will retain dominance over the cutting-edge frontier inside Silicon Valley, the logical ecosystem—the models that individuals can actually download, run, modify, and build upon—will increasingly be dictated by China. Closed American models may lead in absolute performance, but open Chinese models will act as the foundational anchor against price resistance. If US companies attempt excessive price hikes, these powerful open models will cap those increases.
This feels remarkably parallel to the history of Linux servers. Data centers chose Linux because, at scale, avoiding licensing costs, maintaining deployment control, and escaping vendor lock-in are critical. Windows Server still plays a role where vendor accountability and specific enterprise integrations are required, but in large-scale infrastructure, open systems overwhelmingly won.
We are likely to see the exact same phenomenon in AI. An open local model doesn't have to be the absolute best. If it is 'good enough,' cheap, easy to deploy, and free from volatile vendor pricing, it will become the core of the infrastructure layer.
If that happens, the foundational 'layer of thought' embedded in our systems might no longer be based on American cognitive frameworks, but on Chinese ones
I am certain that AI will be deeply integrated directly into our infrastructure. The reason is simple: spending time memorizing YAML syntax just to configure a CI/CD pipeline is a complete waste of time. Because of this, we will inevitably see a surge in services that orchestrate small, domain-specific agents tailored for these exact niches.
When that happens, are we really going to integrate expensive American model APIs to run them? Or will we just rent small GPU servers and spin up local models? I strongly believe the latter is far more likely.
Stop using this word so lightheartedly. You have no clue what you are talking about, and ridicule millions of people who have suffered through decades of oppression.
I am no-where near as concerned by this as I was a year ago, when I was expecting the axe to fall at any moment before the Chinese labs achieved some sort of escape velocity. I now think it's too late, all the cats are out of all the bags, there's no moat except maybe a temporal one of a few months, the genie is out of the bottle.
There is no secret sauce the US labs have that the Chinese ones don't, or won't have soon enough. Deepseek 4 and Kimi 2.5 are not quite Claude 4.5/GPT5.5 but there's no fundamental principle missing - they are strong evidence that there's no real advantage the "frontier" labs possess that isn't related to scale, which they will gain in time (if they even need to). The RL post-training techniques that work are widely known and easily copied. All Deepseek is really lacking is data, which they're getting - and the harder Anthropic/the USG makes it to access claude in china, the more of that precious data they'll get!
I used to sort of entertain the "fast take-off breakaway" scenario as being plausible but not really anymore. The only genuine moat the frontier labs have is their product take-up, which isn't nothing, far from it, but it's not some unbreakable technological wall. Too late guys - it might have been too late for quite some time.
I wish it was true. I would gladly use a GPT 5.2 high model equivalent for coding (6 months old) if it was offered cheaper by Deepseek or Kimi. And I'm sure that's an extremely prevalent opinion by the millions of Claude and Codex users who are bothered by the costs.
However, they just don't perform that well in practice. That's the real issue. You can actually see it when you move away from open benchmarks. Deep seek 3.2 is 4% on Arc-AGI 2 [1], while GPT 5.2 high is 52% and GPT 5.5 pro high is 84.6%. That's the real reason why nobody is using these models for serious work. It's incredibly frustrating.
In addition, I already feel the pain myself on the model restriction. I'll ask my codex 5.5 agent to crawl a website - BOOM, cybersecurity warning on my account. I'll ask it to fix SSH on my local network - another warning. I'm worried about the day my account gets randomly banned and I cannot create a new one. OpenAI already asks you to perform full identification in order to eliminate these warnings - probably exactly for that - so that if they ban you, it's permanent.
[1] https://arcprize.org/leaderboard
I worked extensively on ARC AGI before and one thing is SURE as hell. OpenAI and Gemini in particular use this as marketing material. You can correlate the benchmark release with stock price increase. They feed synthetic datasets of ARC into their models to boost the numbers. There is no doubt in my mind Gemini is no better than DeepSeek other than being specifically fine tuned for ARC AGI. Heck, they even say so and they say they have paid annotations for ARC. Again, economic incentives. In terms of whether these models are actually better at the benchmarks, likely not. See ARC 3, where the gap is diminishingly small.
> Deep seek 3.2 is 4% on Arc-AGI 2
Why are you bringing up an outdated Chinese model from 6 months ago to compare to a US model from 6 months ago? The outdated Chinese model will have performance from ~12 months ago, obviously. But today's Chinese model DeepSeek 4 has performance not far from the US model 6 months ago; 46% compared to 52% from 5.2.
I 100% agree with you, but I've been convinced over the last year that it's a time and scale issue, not anything fundamental.
The Chinese models right now are in a weird spot. Compared to the frontiers, both their pre and post training is woeful - tiny, resource constrained in every dimension including human, slow. I'd compare it to OpenAI 5 years ago except I think even then OpenAI had way more!
But they "cheat" quite a lot in distillation and very benchmark-focussed RL and that's where you get this superficial quality in the leaderboards that doesn't match up when you go off-script. Arc is a great example in that it really belies an "inferior soul" at the heart of it all.
What gives me great hope though is that those same scaling laws that Altman and others have been hyping forever will absolutely kick in for the Chinese labs just as they did for the US ones, and I don't think anything can stop that process now. So they will catch up. It won't be tomorrow, but it's not going to be 10 years either. 3-5 would be my reasonably educated guess.
And the final risk, that China itself might try to restrict availability of the tsunami of GPU or other AI hardware it will inevitably produce - well, I just can't really imagine a country that has been configuring itself for the last 40 years as a single purpose export machine deciding that actually, no, it doesn't want to export something.
About the model restrictions - absolutely. I've been trying to do security research on my own software and the frontier models immediately get suspicious. I've been playing with the local ones much more this year basically because of this. They have deficiencies, for sure - they feel very "hollow" compared to the major labs. But I've talked to a lot of people, and the consensus is pretty clear - just a matter of time.
Have you tried the latest DeepSeek v4 Pro inside of the Claude Code harness? It's not listed in that site.
It definitely 'feels like' it is as good as Claude for many regular web app coding tasks (though I don't have real benchmarks). And it is comically cheap.
I'm not suggesting it is better than the latest Claude or codex models, but it seems 'good enough' for a lot of use cases in my limited real world testing.
They're not even that much cheaper (1/2 price per task according to Artificial Analysis) once you account for lower token usage of GPT-5.5. I can't justify it when factoring in the extra time wasted, and the cheap codex usage I get through the monthly plan. Frontier intelligence is not a commodity product ... yet.
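To make the "price per task" point concrete, here is a rough sketch of the arithmetic. Every number in it is made up for illustration (these are not real prices or token counts from Artificial Analysis or any provider), and `cost_per_task` is just a helper name I picked. The idea is only that a model with a much lower per-token price can end up a lot less cheap per task if it burns more tokens to finish the same job.

```python
# Hypothetical numbers for illustration only; not measured prices or token counts.
def cost_per_task(price_per_million_tokens: float, tokens_per_task: int) -> float:
    """Return the dollar cost of one task at a given per-token price."""
    return price_per_million_tokens * tokens_per_task / 1_000_000

# Assumption: the cheaper model charges 1/4 the per-token price
# but needs twice as many tokens to complete the same task.
frontier = cost_per_task(price_per_million_tokens=10.0, tokens_per_task=100_000)
cheaper = cost_per_task(price_per_million_tokens=2.5, tokens_per_task=200_000)

ratio = cheaper / frontier
print(f"frontier: ${frontier:.2f}, cheaper: ${cheaper:.2f}, ratio: {ratio:.2f}")
```

Under those assumed inputs the 4x per-token discount shrinks to 2x per task, which is the shape of the "1/2 price per task" claim.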
Arc has no predictive power whatsoever. I always use the best models available. So far I haven't found a task that Chinese models cannot solve very quickly and reasonably. Do you have any examples where they failed for you?
If you want something close to Claude, use glm 5.1 with Claude Code. Their subscription price is no longer 10 times cheaper though (at best 2 times cheaper).
And yet Claude six months ago was amazing and good enough for you.
This shows that AI cloud consumption is just a conspicuous consumption status symbol, nobody knows why they need cloud AI or what problem they are even solving.
Which is why, I believe, the big AI companies are starting to focus and roll out vertical products more. They know that the models themselves aren't sticky, people can easily switch between different models with not much hassle.
I think the big AI companies are trying to transform into the next Microsoft. Completely capture both enterprise and consumers.
"I think the big AI companies are trying to transform into the next Microsoft. Completely capture both enterprise and consumers."
That is going to be a failing strategy though. Whatever OpenAI or Anthropic implement, Microsoft and Google can trivially copy and provide to their existing customers that are already deeply invested in their platforms.
> There is no secret sauce the US labs have that the Chinese ones don't, or won't have soon enough.
Over the last year it seems that the only thing US labs are ahead in is money spent. At least half of the technical innovations, if not more, came from Chinese labs and were published openly.
Broad and deep capital markets are a real competitive moat for the USA. No other country or economic bloc can quickly deploy huge amounts of capital to new opportunities nearly as fast. China can work around that to an extent with a command economy that focuses resources on national strategic priorities but it's slower and less effective over the long term.
All of the reasons in the article also apply to Chinese companies. If a Chinese model becomes good enough to make it significantly easier to hack Chinese government servers, do you think they'll allow random people unfettered access to it?
The economic pressures are the same, too. Currently, Chinese models are offered for cheap or in some cases provide weights for free because that's the only way to gain traction. (That closed-weight releases by Baidu, Bytedance, iFlyTek etc. hardly generate any buzz bears that out, as does the fact that when Alibaba does a closed-weight release, someone always gets confused because they associate the Qwen brand with open models.) At some point, their investors are going to want profits, not just user counts. That means higher prices, or no more new models.
If there's no secret sauce and all you need is scale, that would actually be kind of the worst-case scenario for catching up to the frontier, since scaling is expensive and the frontier model companies have easier access to capital as well as higher revenues.
> If a Chinese model becomes good enough to make it significantly easier to hack Chinese government servers, do you think they'll allow random people unfettered access to it?
They aren't trying to become that good, nor do they need to in order to have real positive impact. Models like Mythos are estimated to be humongous even on a datacenter-wide scale, which is actually a big factor in its limited availability at present. It's mostly helpful as a one-of-a-kind proof of concept, to answer the question of whether AI can still plausibly scale by growing capabilities and what happens to alignment concerns when you do that.
Harness engineering is a moat. There’s user loyalty and reliance on the chassis that Claude is on, for example, just like there’s more market share by MacOS+WindowsOS over Linux Open Source.
I regularly switch between codex and Claude in the same sessions. I’d throw in other models if I could.
Data governance and enterprise sales is a moat. The harnesses aren’t.
The tooling industry has been moving very much in the direction of "plug in the AI of your choosing" for a while now, and given how much Anthropic fights the 3rd party tools, they are definitely afraid of being left in the dust.
> just like there’s more market share by MacOS+WindowsOS over Linux Open Source.
It's hard to change OS. It's not hard to jump from one AI tool to another
It's absolutely NOT a moat. Making a harness is the EASY part.
If you had said "marketing is a moat" then yes, I would say you were right. But creating a harness equal to or better than Claude Code is trivial. The CC harness is actually shit. There are tons of open-source harnesses that work better than CC while using Opus via OpenRouter.
I thought so too.
But 1) people use other models with that same harness. 2) I moved on from Claude Code and had all the features I cared about up and running in less than a couple of days, without even looking for available plugins or extensions.
> Harness engineering is a moat.
I mean, if that’s the case, then Anthropic themselves are currently actively filling in that moat with nice, solid, walkable dirt. Claude Code may have been a moat 6 months ago but these days you’ll want to replace the “m” with a “bl”.
I agree the genie is out of the bottle technologically. I'm less convinced that means access stops being politically and economically important. The bottle may be gone but the best lamps are still expensive
But a “good enough” lamp just got a lot cheaper. The cost of tokens on DeepSeek V4 Pro is so low I don’t even think about it, and I'm currently trying to figure out useful things to do with as many simultaneously running agents as I can. What would have cost $150 less than a year ago now costs 35¢.
Likewise Qwen 3.6 absolutely blows me away and that’s on a 35b 6-bit model on a local 5090. Same thing, busy trying to find stuff to do to keep it busy 24/7.
I can still find some niches for Opus 4.7 but being able to attack problems and not worry about consumption is a game changer.
Virtually no one is going to pay for the best performing lamp if the next best lamp does 90% as good for an order of magnitude cheaper.
I will say, as pointed out by others, DeepSeek and other Chinese providers still lack a bit in the tooling that Claude has, but they'll get there.
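For a sense of scale on the "$150 to 35¢" figure above, a quick back-of-the-envelope check. The two dollar amounts are the ones quoted in this comment; the variable names are mine, and nothing here says anything about which workload or token volume those costs correspond to.

```python
import math

old_cost = 150.00  # dollars, the "less than a year ago" figure quoted above
new_cost = 0.35    # dollars, the current figure quoted above

reduction = old_cost / new_cost              # how many times cheaper
orders_of_magnitude = math.log10(reduction)  # same drop on a log10 scale

print(f"{reduction:.0f}x cheaper, about {orders_of_magnitude:.1f} orders of magnitude")
```

So the quoted drop is somewhere between two and three orders of magnitude, a good deal more than the single order of magnitude used in the lamp comparison.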
I would agree, the only thing Kimi is really missing is stability and harness training. For general chat tasks I consider it mostly on par. Occasionally I'll give the same problem to Kimi, Claude, GPT, Gemini and it's not unusual to see Kimi correctly figure out some kind of weird extra thing that the others missed, like some kind of mentally unstable savant.
> There is no secret sauce the US labs have that the Chinese ones don't, or won't have soon enough
This is not just about mainland China though. The current US government is extremely selfish and self-centered. Other countries really need to consider their own long-term situation here.
> The only genuine moat the frontier labs have is their product take-up
And even then, there is no stickiness. For most use cases there isn’t much value in one frontier model over the other.
Just have to look at the people flocking from one to the other for whatever reason.
I’m flocking from GPT to opus every week for the past 3 months and always come back.
The point isn’t that gpt is better, it’s that it is so much better for my work it isn’t even sticky, it’s reinforced concrete. I use opus 1% of the time because it writes better and it’s sticky there.
Yes I’ll switch approximately immediately if opus or Gemini (which I use more than opus!) is better for what I do, but at this point frontier model tokens are not fungible.
The large AI houses arguably ensure that model switching be a natural action for their clients, by switching the default model of their flagship offerings every few months. Such is the price of progress.
What about access to GPUs and memory? This is becoming a pretty major bottleneck.
Today's tech echoes 1960-1970 mainframe era: very centralized around a handful of companies controlling "massive cloud compute" in bespoke mainframe-like topology.
All of that will be legacy in a couple of years. Today's B200 clusters are tomorrow's e-waste. Decentralization might happen gradually or abruptly. But to me it's obvious that we'll be thinking of high-tech tensor processors and GPUs the way we thought of individual transistors and tube amplifiers in the 1980s.
If AI turns out to be the revolution it purports to be, then the underlying hardware will change much more rapidly than it did with ICs and microprocessors in the late 1970s. Today's hot is tomorrow's junk.
It's basically converted sand. Most of that conversion happens in Taiwan at the moment, which China considers one of its provinces and the USA treats as a protectorate. Hence the interest in that region....
Everyone is expecting them to invade Taiwan, but why not merely extort Taiwan?
No mention of open weights anywhere in the piece, which is weird. Qwen, Llama, DeepSeek are months behind frontier, not years. If you're a European startup worried about getting cut off from Anthropic's API in 2027, the real question is what the open-weight frontier looks like then. Probably pretty capable. That undercuts most of the doom scenario.
Also, he concedes Mythos-level capabilities will be cheap next year, then handwaves it with "you need the best AI, not good-enough AI." For most use cases, frontier minus six months is fine.
Open weights undercut the absolute cutoff scenario. They don't fully solve the question of who gets the best model first, who gets enough tokens to use it heavily, and who gets to integrate it into sensitive workflows without waiting for permission
Affordability of hardware that can run local LLMs is a real factor, too. Not sure when RAM prices are going down, but with everything that’s happening and can happen in the world right now, it doesn’t look like it’ll drop in the near or medium-term
No one is going to run models that are comparable to frontier locally without spending enormous sums for use at scale or in large orgs. Even with cheap RAM, you will still need a very large budget for frontier-level capability.
Open models that are competitive with frontier will be used on shared hosts.
Open weight models do not mean you can run them on your laptop (except for the small ones). It means that someone independent (a cloud provider, another company ...) can build big computers that are capable of running those models and provide you metered usage.
At the end of the day, as a consumer, you still pay per token (or per something) to your provider, except you can choose from multiple providers with your own criteria. If you want to use DeepSeek v4 hosted in Europe, it's possible.
Open weights will remain open only if they’re significantly worse than the frontier weights.
Before you challenge with benchmarks, consider the labs which release open weight models have internal testing and unpublished results.
> Open weights will remain open only if they’re significantly worse than the frontier weights.
This makes the assumption that you earn more money by selling access to the model than by releasing the weights. That might be true for a company, but a US adversary might profit more from tanking the US economy. NVIDIA's stock dropped by 17% in a single day after DeepSeek-R1 was released, and the share of tech companies in the S&P 500 has only risen since then.
There are two problems with that scenario:
1. Your European startup will be competing with others using a much better frontier model. In a scenario where you already have other major disadvantages (access to capital, labor), you might be outcompeted
2. Open models have been keeping pace very nicely, but they rely on distillation of frontier models. If the race gets really tight, this could be affected so that the time gap grows larger (ie, it's very unlikely anyone but Anthropic is distilling from Mythos at the moment)
> 1. Your European startup will be competing with others using a much better frontier model.
If the small (and I'd even say sometimes imperceptible) difference between Opus & DeepSeek v4 Pro is such a disadvantage for your startup, then your startup has an issue, not the LLM.
At the end of the day, your startup is there to solve real problems, and even before LLMs, being fast at coding things has never been such a huge competitive advantage compared to marketing, sales, customer support, product vision ...
Llama is not months behind GPT 5.5 Pro. I don't think Qwen or DeepSeek are either.
edit: I'm specifically referring to the "5.5 Pro" model, not regular 5.5 with Pro tier subscription. Claude has no model available that's comparable to 5.5 Pro either.
I’ve used DeepSeek 4 Pro through Claude. It’s fine. Plans are similar to what sonnet/opus make. Same massage-the-plan -> massage-the-code loop. Maybe the code is a bit worse, but that’s the “months behind” thing.
The thing is, the vast majority of code tasks aren’t a venture into the unknown. We as an industry for the most part build CRUD interfaces and dashboards. That can be achieved, with supervision, with frontier open-weights models quite well.
There's no evidence there's any 5.5 Pro model distinct from 5.5 xhigh or whatever.
https://developers.openai.com/api/docs/models
Someone recently made a graph showing that the gap between US American frontier LLMs and Chinese open weight LLMs (including DeepSeek v4) is widening. Unfortunately I can't find it anymore.
Update: GPT-5.5 found it.
Article: https://www.nist.gov/news-events/news/2026/05/caisi-evaluati...
Graph: https://www.nist.gov/sites/default/files/images/2026/05/01/1...
Give it time. It's inevitably a logistic curve.
This is propaganda, not data.
If the Chinese government published a graph that said the opposite, would you consider that a serious and objective source?
"Someone" is an official website of the United States government. I would prefer another source.
Open models are pretty good at this point but the problem is that they are limited by the tooling and infrastructure that surrounds them. For example, the last time I tried to set up web search with an open model, the experience was pretty bad.
In our company of 24 employees, we get by with two DGX Sparks. We don't use AI heavily, but each Spark can serve about 6-8 concurrent requests with a full context length of 256k, which is decent. We get about 35 t/s depending on the model we use (currently Qwen3.5 122B A10B and Qwen3 Coder Next), but we might set up a smaller model too for simpler tasks.
This works for us and will work for years to come. It is not SOTA, but it works darn well for our purposes, and we control the compute and data flowing through it, so totally worth it.
That's pretty nice actually, how much KV cache does that model require at full context? That tends to be the main limit to running concurrent requests locally, there's KV quantization but it has outsized negative impact on model quality.
I have experimented with both q8 and q4 for KV cache. I can't find any difference between q8 and fp16, but q4 suffers more when the context grows. q8 seems like a good compromise and gives us enough ctx for about 6-8 concurrent, full context sessions. But we have not fully tested those limits yet, as the context windows rarely reach the limit.
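For anyone wanting to put rough numbers on the KV cache question, here is a back-of-the-envelope sketch. The architecture values (layer count, KV heads, head dimension) are hypothetical placeholders, not the real configuration of any Qwen model, and the bytes-per-value figures ignore the small scale/offset overhead that real quantized cache formats carry. Models that use sliding-window or linear-attention layers would need less than this formula suggests.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_tokens, bytes_per_value):
    """Approximate KV cache size for ONE sequence at a given context length."""
    # Factor 2: one key tensor and one value tensor are stored per layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical architecture values, chosen only for illustration.
N_LAYERS = 24
N_KV_HEADS = 2
HEAD_DIM = 128
CONTEXT = 256 * 1024  # "256k" full context

# Idealized storage cost per cached value (quantization overhead ignored).
BYTES_PER_VALUE = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}


def kv_cache_gib(precision, sessions=1):
    """KV cache in GiB for `sessions` concurrent full-context sequences."""
    per_sequence = kv_cache_bytes(
        N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT, BYTES_PER_VALUE[precision]
    )
    return sessions * per_sequence / 2**30


if __name__ == "__main__":
    for precision in ("fp16", "q8", "q4"):
        one = kv_cache_gib(precision)
        eight = kv_cache_gib(precision, sessions=8)
        print(f"{precision}: {one:.1f} GiB per session, {eight:.1f} GiB for 8 sessions")
```

The point is only the scaling: cache size grows linearly with context length and with the number of concurrent sessions, so halving the bytes per value (fp16 to q8) roughly doubles how many full-context sessions fit in the same memory.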
This is pretty cool. How would you say that these open models compare to SOTA on coding tasks? I pay $200/mo for Claude Max but honestly this sounds way more fun.
Nowadays I use our local setup 95% of the time, but it is not that long since that flipped for me personally.
Context: I have a $20 Claude Code subscription, and have used it for a handful of small-ish projects over the last year, in parallel with local models on my AMD 9700XTX (24GB) at home. Mostly Ministral 14B and more recently Qwen3.6 27B Dense 4q.
Historically, the tooling (inference engines and harness) has been the biggest challenge when using local models; a lot of the benefit of Claude Code was a rather unified and well-oiled agent system. Local setups often bring with them subtle incompatibilities between models, inference engines and agent systems that are not obvious from initial testing, but cause trouble on projects larger than a couple of files.
The Spark setup at work is now at a point where I do not miss Claude, like at all. A big part of this is the harness and the tools available to the agent, most critically a good tool for searching online. I use my Kagi subscription to allow the models to fetch up-to-date information, and the Kagi MCP I use also has a summarizer which is very helpful in avoiding rapidly filling up the context window.
I mostly use Zed and its native agent, which only recently got muuuch better, and on the terminal I use Pi with a minimal selection of extensions (currently pi-kagi-search, pi-smart-fetch, pi-btw and pi-diffloop). I also have Pi in Zed via the ACP, but it does not work so well with some of the extensions; the lack of a built-in permission system is especially a problem when YOLO-mode is the only mode :)
Honestly, as long as you have a model that is decent at tool calling, you're good. Having a solid and stable frame around your model makes a huge difference. The only caveat in all of this is that I spend most of my time on smaller projects and debugging on Linux-based systems, not huge and complex code bases, so your mileage might vary.
The next phase at work is to set up a ChatGPT-like web interface, and so far LibreChat is at the top of my shortlist. We had OpenWebUI for a while, but it is so bad at using MCP tools that it is practically non-functional for us. LibreChat is a bit more work to set up, but the interface and its MCP story are much more solid. The goal is to plug our internal helpdesk, docs and task manager system into LibreChat via MCPs to give us a quick way to query and gather information that is currently very time-consuming to collect on your own.
The distillation risk has been brewing for a while now. In a very real sense, the model is the data, so if the data is locked down because of how valuable it is, it was only a matter of time before fully open access to the models would be revoked.
There's also an additional economic concern that rarely gets mentioned: because no one has cracked continual learning, keeping models up-to-date and filling in gaps in performance requires retraining on an ever growing dataset. Granted, you aren't starting from scratch each time, but the scaling required just to stay relevant looks daunting.
I don't know where any of this goes on a societal level, but I've believed since the release of deepseek r1 that access to frontier models would eventually be locked up behind contracts, since the only moats protecting the models themselves are purely artificial. It remains to be seen how effective China is at pushing the envelope, and whether they are interested in providing unfettered access. And on top of that, it remains to be seen how well these models actually turn out to scale in the long run.
They are also not getting the same quantity or quality of data as was possible in the first years of "ingest". Compared to the beginning, from here on it is more like a drip feed of new training data. Still immense volumes of data, but we are talking 1 year of data production from society versus centuries of text and data ingested in a short time frame.
For pre-training, yes. But for post-training you need high-quality labelled datasets for reinforcement learning. So far AI has been most successful in coding, because you can translate the usage into such datasets, and thus produce a virtuous cycle: More usage produces more data, which produces better models, which drives more usage.
The question is whether this same model can successfully be applied in disciplines like medicine, law, engineering, etc.
This is a good point, especially the "model is the data" framing
The more fundamental bottleneck is not even the frontier models, it's the datacenters. Let's say Europe breaks apart from the US completely tomorrow. It does not have enough datacenters (or GPUs in general) to sustain its inference needs even if it would resort to Chinese open models. And to build new datacenters, it would need to source parts from the US and China.
In other words, if AI does have continued significant economic impact, only the US and China would be able to leverage it completely. The rest of the world is implicitly betting that AI won't be good enough, or that eventually the compute curve flattens out so using a model that is 10x larger only leads to marginal benefits.
> The more fundamental bottleneck is not even the frontier models, it's the datacenters.
Is it even though? Quantization and speculative decoding are improving the local AI story by leaps and bounds every month.
Speculative decoding is not that useful at scale, it's mostly about making local single-user inference faster. When you're batching multiple inferences together, that's already as fast as the verification you have to perform w/ speculative decoding.
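To make the draft-then-verify idea concrete, here is a toy sketch of greedy speculative decoding. The "models" are hypothetical stand-in functions over integer tokens, not real LLMs, and the verification loop calls the target once per position where a real engine would score all drafted positions in a single batched forward pass. The usual bonus token after a fully accepted draft is left out to keep it short.

```python
from typing import Callable, List, Tuple

# A "model" here maps a token prefix to its greedy next token.
Model = Callable[[List[int]], int]


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    """Plain autoregressive decoding: one model pass per new token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    """Greedy speculative decoding. Returns (tokens, verification_rounds)."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    rounds = 0
    while len(tokens) < goal:
        # 1. The cheap draft model proposes up to k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))
        # 2. The target checks every proposed position (counted as one round).
        rounds += 1
        accepted: List[int] = []
        for i, guess in enumerate(proposal):
            truth = target(tokens + proposal[:i])
            accepted.append(truth)  # always keep the target's own token
            if truth != guess:
                break  # first mismatch: discard the rest of the draft
        tokens.extend(accepted)
    return tokens, rounds


def target_model(prefix: List[int]) -> int:
    """Toy 'large' model: counts upward modulo 10."""
    return (prefix[-1] + 1) % 10


def draft_model(prefix: List[int]) -> int:
    """Toy 'small' model: agrees with the target except after a 0 or a 5."""
    return 0 if prefix[-1] % 5 == 0 else (prefix[-1] + 1) % 10


if __name__ == "__main__":
    out, rounds = speculative_decode(target_model, draft_model, [1], n_new=12)
    print(out, "verification rounds:", rounds)
```

The output is identical to plain greedy decoding with the target, just reached in fewer target rounds. That saving matters for a single user whose hardware is otherwise idle between tokens; a server that already batches many requests fills each forward pass anyway, which is the point made above.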
There is "local AI" which is running on consumer grade hardware and "local AI" which still needs a datacenter (DeepSeek 4, GLM 4.7, etc). If you woke up tomorrow and could only use the latter you are about 6 months behind the frontier, if you have to rely on the former you are 2 or 3 years behind.
All these tricks like quantization and speculative decoding can also be used by the leading AI labs, which means they will simply have more compute than you at the end of the day. So far this has translated into better performance.
but ASML is in Europe - so they hold at least some critical part of the stack.
In theory yes. They've got a bargaining chip with TSMC. But it's unclear how much use that would be without a safe shipping route between Europe and Taiwan and/or a navy capable of maintaining such.
Considering the economic angle, one possible long-term future is that access to frontier models is only realistic for the wealthiest 1%. They will use this access to ultra-intelligent models to increase their wealth further. Inequality will continue to worsen.
The thing is, the open source models are smart enough to do most work if the harness and orchestration are right. So even if the next-gen models get locked behind monopoly paywalls, build real things in the real world and fight for a humane world.
The availability of open models with such capabilities depends on the goodwill of the Chinese. And that might end eventually, especially since the matter comes down to a single decision by Xi and the party.
True but I’d argue they are good enough now to do 10 years of continued workflow automation. It’s like the internal combustion engine or personal computer, at some point they were good enough for broad categories of work. I think that’s where the current models are.
Over on the image generation side, "frontier AI" seems to be coming along rather well. Watch this video, which was released eight days ago.[1] Can you find any flaws? Two years ago, just getting hands with the right number of fingers was tough. Last year, there were jarring errors in every scene. Now, very little is wrong. How much longer will anyone need Hollywood studios?
[1] https://www.youtube.com/watch?v=4zTCLIhScCM
It is a LOT better than 2 years ago, but there are flaws and it's unpleasant to watch. The easiest to spot is their shoes (which they weren't wearing 1 second ago) flying off their feet without being kicked off in the first 10 seconds.
But if progress keeps going I'm sure it will get to the point where my brain doesn't feel sick after watching it. I hope so, because I'm sure there's a lot of AI videos in my future, whether I want them or not.
Yes, such systems are still struggling with continuity.
(There might be a workflow solution to that. Part of the system needs to do the job of what old films list as the "continuity girl". For each shot, there's a blocking diagram of who stands where at the beginning of the shot. There's a description of what each character is wearing, holding, or touching. If something generated that for the end of each shot, and it was fed into the prompt for the beginning of the next shot, that would help maintain continuity. This is another example of where a concrete mid-level abstraction is needed to keep things on track.)
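A minimal sketch of that continuity hand-off, assuming some other component can already produce the end-of-shot description (that extraction step is the hard part and is not shown). All class names, fields and the prompt wording are made up for illustration; no particular video tool is implied.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CharacterState:
    """What the 'continuity' record tracks for one character."""
    position: str
    wearing: List[str] = field(default_factory=list)
    holding: List[str] = field(default_factory=list)


@dataclass
class ShotState:
    """Blocking and props for every character at the end of a shot."""
    characters: Dict[str, CharacterState] = field(default_factory=dict)


def continuity_notes(state: ShotState) -> str:
    """Render the end-of-shot state as plain text lines for a prompt."""
    lines = []
    for name in sorted(state.characters):
        c = state.characters[name]
        wearing = ", ".join(c.wearing) or "nothing noted"
        holding = ", ".join(c.holding) or "nothing"
        lines.append(f"- {name}: at {c.position}; wearing {wearing}; holding {holding}")
    return "\n".join(lines)


def build_shot_prompt(description: str, previous_end: ShotState) -> str:
    """Prefix the next shot's description with the previous shot's end state."""
    return (
        f"{description}\n\n"
        "Continuity at start of shot (must match end of previous shot):\n"
        f"{continuity_notes(previous_end)}"
    )


if __name__ == "__main__":
    end_of_shot_1 = ShotState(
        {"Ana": CharacterState("left of campfire", ["sneakers"], ["guitar"])}
    )
    print(build_shot_prompt("Wide shot: Ana walks toward the car.", end_of_shot_1))
```

Keeping the state as structured data rather than free text is the "concrete mid-level abstraction": it can be diffed between shots, so shoes that appear or vanish show up as a change in the record before anything is rendered.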
Anyone have any idea what tool generated this? It's way past Stable Diffusion.
the most easy to spot is the camp fire reflection on the car :D
Still in the uncanny valley for me. Like watching AI the film. That said, it’s 3-minutes long and maintains the setting across many different angles, zoom levels, etc. pretty impressive.
And even if there weren’t any jarring errors, and rest assured there’s about a billion of them, there’s no appeal to this. It’s all context free short unassociated clips of pretty faces dancing on a beach. And?
There’s no narrative, there’s no sense of reality, it’s just a sense of here’s a million pixels of colours that have proven to go well with each other, it’s _slop_.
It’s been years and the only place AI has conquered in visual entertainment is as a subpar Photoshop replacement to fill in the B-roll gaps for those that don’t have the patience or money to do it the proper way.
How elitist of you to belittle 95% of all creators who are no better than that.
Well it is an ad, and all ads need to do is pump their "brand" into your head, so it was always slop.
>How much longer will anyone need Hollywood studios?
As long as they need shots longer than a few seconds, I suppose
>Now, very little is wrong.
You think culture is a matter of whether you've genned bass drums in front of or behind the drummer? (or to be a little more critical, whether you've remembered to add "sprinkle in some racial diversity" after the first prompt). You got the Coca-Cola trademark just popping up throughout, though, good job there
The video doesn't even have to load to know it's AI generated. The channel profile thumbnail and the video description are dead giveaways. The first frame of the video has too many errors to be worth repeating here. The first 0.5 seconds of the video has implausible movement.
I hear you. It is impressive technically, but as far as finding flaws, I will just say this. This looks like something aliens would create in a dystopian simulation based on a very odd understanding of old movies. I found it quite unsettling. Do you really think this would replace films with real actors and real writers (assuming they left the millennial talk and "modern audience" stuff)? I think the memes and parodies for AI video are more interesting than this kind of thing.
Every continuous shot lasts no more than five to ten seconds. It's not a "give-away" as such, but it's certainly a tell. r/aivideo is chock-full of this crap.
The thing that’s always missing from videos like this is how much prompting or manual editing it took. It’s always implied that it was a one-shot, when it almost certainly was not.
It is way better than some years ago but like every scene got something strange. Look closely. Look at them throwing shoes at 55s if you want something really obvious.
> Can you find any flaws
Physics.
I think we'll know this is true when deepseek becomes illegal in the US.
I tried to sign up for deepseek API access directly from the company out of gratitude for the open source contribution (deepseek.com) but payments are blocked by US government rules.
The point of open weight models is not having to rely on specific providers for them.
For now open weight models are legal. Conceivably the government could claim that an open weight model "contained" forbidden information and simply ban it. We live in authoritarian times.
What's the likelihood that universities eventually become open model providers?
Zero, but update priors when you see campus football stadiums replaced with datacenters and gas turbines
When my dad got his Computer science degree at Queens University (Ontario) in the late 1970s, there was an entire faculty-sized building on campus with no windows and few doors. This wasn’t just the CompSci department, the building WAS the computer.
Universities are struggling to prevent their students using AI, because it makes both learning and evaluation extremely difficult.
This. It's extremely frustrating, AI can be a 24/7 tutor, but too many students use it to do their work instead.
We have to really rethink how and what we teach, and how we evaluate. Scoring (non-handwritten) homework is pointless, even counterproductive (because it incentivizes cheating, even for the students who don't want to, just to not be outscored by the cheaters). Hand-written homework means the students at least have to have read the work once...
And soon, with AI glasses, even in-classroom tests will be difficult.
Universities provide the AI to students. We have CoPilot in our 365 and the Ed majors get free AI crap all the time. Hard to make drugs illegal when you're the dealer
Way too expensive.
Oh. That's an interesting concept. Expand
The uncomfortable implication is that "AI sovereignty" may end up being less about training your own GPT-class model and more about securing compute, energy, datacenter security and contractual access
Au contraire, I think we’ll soon see $10,000 self contained boxes (512 GB Mac Studio, basically) that you plug in to a LAN and have infinite inference.
People pay for cars and trucks (same cost scale) globally as they’re profit-generators for every user (as far as giving them access to jobs).
This is the same.
Yeah, so it's just business as usual: If you have ungodly amounts of money, you can essentially do anything, and if you don't, you can't. It's always been this way, and it'll always be this way. I don't see this as a world-ending issue.
The sovereignty part of that may have more to do with access to users' data and interactions by law-enforcement and intelligence surveillance.
It's the same as energy sovereignty.
Quote:
> “The two AI superpowers are going to start talking. We’re going to set up a protocol in terms of how do we go forward with best practices for AI to make sure nonstate actors don’t get a hold of these models,” Bessent told Joe Kernen on Thursday, on the sidelines of President Donald Trump’s two-day meeting in Beijing with Chinese President Xi Jinping.
https://www.cnbc.com/2026/05/14/us-china-ai-rules-bessent-us...
OpenAI is already talking openly about gated access to their models (see this OpenAI podcast episode for example: https://openai.com/podcast/#oai-podcast-episode-16)
Separately there's also a very active effort to stop open weight releases.
It's dangerous to those who think access to frontier intelligence is important.
Open-Source will handle access to models, someone will find a way. Security by obfuscation has never worked.
I wonder if the countries that don't have "AI Sovereignty" end up being like what Japan is now, technologically. It's stuck in 90's/early 2000's tech and norms (i.e. left behind) but its infrastructure and society chugs along (the demographic problem is a separate issue).
Would that make those countries more attractive to young people perhaps? As a place to grow and learn skills where the opportunities are non-existent in the AI Sovereign countries.
We should be aiming for less token usage, ideally none at all. Current AI uses LLMs to expand horizontally, but with the goal of achieving vertical progress: inventing truly new stuff and being able to eliminate our biggest problems. Problems like cancer need only be solved once, and no more tokens are needed after that.
Think this somewhat underestimates economic pressures the US labs are under.
OpenAI etc need to make crazy revenue to get their investment math to work. Perhaps you can sell some tokens to privileged partners at a premium rate but I think they’ll need global scale ultimately
Instead of soon, how about just "now"?
I would imagine not everyone on HN has enough disposable income to subscribe to Claude Max or a similar max plan for other models without thinking.
Some people mentioned open weight models, but there are two hurdles. One, the current economy means securing the best hardware is already stupidly expensive compared to a year or two ago. Two, open weight models lack the magic that Claude/Gemini/OpenAI put into the proprietary ones, meaning one will have to create their own agent that is clever enough to search the internet when it knows its training data is stale.
> I would imagine not everyone on HN has enough disposable income to subscribe to Claude Max or a similar max plan for other models without thinking.
You don't need the Max plan with other models if you're not going completely crazy. Other providers have much more generous limits than Anthropic.
It's worse than that - all AI features will get broken down into even finer slices and you will have to pay for everything based on the finest level of slice they can make and still make money.
Physics and economics will drive cost. Current token pricing is based on unsustainable investment and energy cost. However, this is more of an optimization problem than an inherent show stopper. Token cost will inevitably come down over time. But this could take a while before it catches up with demand. Manufacturing will step up to provide cheaper GPUs. Etc. There will be some consolidation but the whole thing will converge on something that should make long-term economic sense.
Ultimately it's a resource control issue. To power AI you need land/space (to build on), water, energy, and lots of hardware. Hardware needs to be manufactured and engineered. It needs metals, some exotic materials, machines, etc. More resources in other words. If you look at China vs US here, they are really well positioned in terms of resources and supply chains. The US has fallen behind quite a bit on energy and all the critical resources needed to produce hardware. AI is bottlenecked on a lot of stuff that China has or makes in abundance.
For the frontier models, there are a growing number of companies and countries that provide them. We're used to mostly talking about the US ones. But of course the Chinese have a lot of capability here and they are not that far behind. And that's judging by the models they choose to release under OSS licenses. Those models are not their frontier models. And there are a lot of other countries developing and using models that aren't necessarily talking openly about what they are doing.
The irony with these frontier models is that they only generate revenue if people can use them. Why sink billions in AI infrastructure and models without a revenue model?
The reality with Mythos is that you have to assume that the Chinese (and others) are not that far behind and may already be running an equivalent model that they just haven't told anyone about yet. Anthropic gate keeping Mythos and its findings is probably wise. But it's not long term sustainable to depend on that happening or working very well. Or even on them even being a leader in this space.
This is becoming an arms race between countries, and economies. And it's an economic and resource control race. Developing and researching in the open has advanced things massively. But it has also empowered the rest of the world. Both Anthropic and OpenAI are staffed with people from all over the world. You have to assume that they probably aren't very good at keeping things secret.
Those billions in AI datacenter infrastructure will eventually be repurposed to run smart models like Mythos, not ChatGPT or even Opus/Sonnet. That future "revenue model" is quite robust to any foreseeable competition from on-prem FLOSS inference. It's a natural fit to the actual capabilities of large datacenter-scale compute.
Lmao delusion. Can’t you see the amount of assumptions you’re making that can be blasted apart by continual innovation?
All the downsides of your cliched agi nightmares but with the “intelligence” of your bog standard national security functionary
DeepSeek is not a distillation of Claude or ChatGPT - stating this is just idiotic politics at this point.
The Chinese labs have reached "escape velocity" long ago - they will continue development regardless of API access to US models or the willingness of US labs to share their research.
I'm not so sure about "soon" - the big labs are profiting from the discovery and experimentation efforts by independent contributors (openclaw, etc) and reducing their capabilities also reduces input from this side.
How much money are you all paying to use this tech? Last I even tried, it would cost my entire salary. Yet, everyone and their newborns are using it every day for everything. How is this possible?
Depends on which exact model we're talking about, and on your salary.
For example, with the $40/month Kimi Code subscription the limits are so generous that you can use it every day all the time for everything (basically just have an agent constantly running doing something) and never run out of tokens/hit the limits.
Probably a combination of:
- people living in places with higher cost of living and corresponding salaries
- people whose employers are paying for it
- people who aren't actually using it but are being paid to hype it up online
- bots
When intelligence is a commercial commodity, it is only bound to happen that the rich gatekeep it to secure their socioeconomic status.
But, I think, with every revolution, hierarchies have historically fallen only for the former serfs to rise.
The Industrial Revolution, the Renaissance: all were marked by a massive shift in socioeconomic status and the rise of the middle class.
I think AGI, when it happens, will only raise equality. I may be wrong.
It took almost two centuries after the Industrial Revolution for broad middle-class living standards to become common for a large part of the population, and that happened only after an intense fight for rights and a fair share of economic gains.
So, sure, AGI might raise equality - but that's only if we fight for it.
I think so too. The rich will be richer, but also more people will have more at the same time. As Civilization put it: 'Just as it has always been'
I mean what if this is an inverse revolution?
Damn. I predicted this last year and got thrashed for it.
Glad to see others catching on.
> margins shrink and become razor-thin
You need to understand that these models are provided by the corporate entities, they are expensive to maintain, iterate and run. There is still no strong correlation between the use of AI and the business outcomes so there should be a real ceiling to how much enterprises would pay for tokens. The gov is a usual choice to establish contracts and get some stability, similar to building nuclear reactors or military equipment. And posturing about limiting model access is just saying it is expensive to subsidise its use for cat image generation or call summaries.
I am pretty sure we have not found the killer app (like an IDE even) for us to extract all the possible value from the models yet. I would even go as far as to say that the synthesis between a human and AI could leverage average models to achieve a lot more compared to the model/agent working on its own.
edit: Just to add to this, I am going through Mythos scans and it is not perfect, very much similar to what pentesters would do with the added bloat of noise in reports about nonissues.
I hope regular people will stop using "national security" and "national interests" as euphemisms and framing, and will call these things a psychopathic fight for power.
Assuming that some humans are worse than others because of their flag picture and that they deserve less access to resources is barbarism. There is no security in limiting access to NSA-style entities; it's an absolute insecurity for everyone but them throughout the whole world. How is that in anyone's "interests"?
We see every day now how suspicious bugs that look exactly like backdoors (e.g., Microsoft BitLocker) get exposed. That's in humanity's interests (and those of particular nations as a subset): not being subjugated by small rings of professional outlaws. We need these instruments to defend people, everywhere. We don't need to give leverage to any state psycho. Let's make every one of them weaker.
If Amodei and the co. were in charge the models would alert the police if someone said "boob" and the goys would only get GPT 2 level models, hell, even that might be too dangerous.
> goys
I suspect this was just a throwaway word usage, but its usage here ends up being pretty anti-Semitic, so probably worth reconsidering its use if that wasn’t the intention of your post.
What part of what he said was false? Dario Amodei and especially Sam Altman have been treating the general public like cattle. And goy simply means non-Jew, how can not talking about Jews be anti-Semitic?!
> And it doesn’t stop with the security questions: the Trump administration’s signature style of international engagement is to wield American leverage as a bundle. Deadlocks in trade negotiations are broken by threatening to withhold intelligence, tech deals are stalled by reference to food safety standards. And so I don’t know when a U.S. administration would choose to leverage its seemingly inevitable predeployment authority over frontier models to secure its broader interests, but I’m sure it would in due time. That means that even if we do everything ‘right’ on the security and economic side, frontier access is still fundamentally contingent as long as there’ll be divergences between governments’ strategic interests.
The Trump Administration telling the very neo-fascist oligarchs who bought him an election and bought him a ballroom to play nice with their toys? At the expense of rampant capitalism? Lol.
He already showed us the limit of his comprehension of the topic when he made EO 14179 limiting states from regulating AI.
Trump doesn't swing for perfect pitches. He is a madman, a lunatic, and a true moron. Do not give this man any credit. I would be shocked if he could tell you the time on an analog clock.
You can be a greedy pig and be an idiot simultaneously. You can see how those two things might even be correlated, no?
I think “bought” here is to be read as “financed from”, not bought in the literal sense.
> to question
That's a weird way to characterize months of incessant "we have incontrovertible hard evidence but you can't see it yet" claims, which--when finally forced into the light--were laughed out of every court in the nation.
If it was just pure and innocent "questioning", things would be very different. We probably wouldn't have had the January 6th mob attack on Congress, for example.
Trump's second presidency is the best possible evidence that no one is driving the world in secret from behind the scenes.
I think we all know by now who he is really owned by.
As someone who actively monitors the Chinese internet as well, I believe we are heading toward a world split into two distinct AI spheres.
Coming from South Korea—a nation outside the US-China dichotomy—the fundamental issue I see is the closed nature of the American AI ecosystem. Products like Gemini, GPT, and Claude are API and subscription-based, meaning their pricing and access terms can change at any moment. If that volatility increases, developers desperate to escape vendor lock-in will inevitably turn to local models.
Chinese open-source models like Qwen and DeepSeek are already exerting massive influence over our domestic AI ecosystem. While the US still revolves around CUDA, China has built its own CANN ecosystem. Most impressively, Chinese local models are incredibly accessible, even for a foreigner like me.
I believe that while the US will retain dominance over the cutting-edge frontier inside Silicon Valley, the logical ecosystem (the models that individuals can actually download, run, modify, and build upon) will increasingly be dictated by China. Closed American models may lead in absolute performance, but open Chinese models will act as an anchor on pricing. If US companies attempt excessive price hikes, these powerful open models will cap those increases.
This feels remarkably parallel to the history of Linux servers. Data centers chose Linux because, at scale, avoiding licensing costs, maintaining deployment control, and escaping vendor lock-in are critical. Windows Server still plays a role where vendor accountability and specific enterprise integrations are required, but in large-scale infrastructure, open systems overwhelmingly won.
We are likely to see the exact same phenomenon in AI. An open local model doesn't have to be the absolute best. If it is 'good enough,' cheap, easy to deploy, and free from volatile vendor pricing, it will become the core of the infrastructure layer.
If that happens, the foundational 'layer of thought' embedded in our systems might no longer be based on American cognitive frameworks, but on Chinese ones.
I am certain that AI will be deeply integrated directly into our infrastructure. The reason is simple: spending time memorizing YAML syntax just to configure a CI/CD pipeline is a complete waste of time. Because of this, we will inevitably see a surge in services that orchestrate small, domain-specific agents tailored for these exact niches.
When that happens, are we really going to integrate expensive American model APIs to run them? Or will we just rent small GPU servers and spin up local models? I strongly believe the latter is far more likely.
Yes, CANN is that "emerging centralising entity" (from POV of Westerners)
I came up with a shorter summary of what you refer to as "my work"
To ensure that the externalities of standardisation are borne by the standard-proposer, eg.
Echoing your countryman https://archive.ph/2014.04.30-203815/http://www.theguardian....
But also Georges Clemenceau ("War is too important to leave to the generals")
So now AI is about apartheid. I am not liking this at all.
Stop using this word so lightheartedly. You have no clue what you are talking about, and ridicule millions of people who have suffered through decades of oppression.