Anthropic says Alibaba illicitly extracted Claude AI model capabilities

1 month ago (reuters.com)

1385 comments

htrp

Here's what is happening:

Chinese resellers are offering Claude tokens at 70-90% below official Anthropic API prices. They achieve this by reselling capacity from pooled Claude Max accounts, payments fraud, and also reselling the model output & reasoning chains to various Chinese labs. They are subsidizing model access in exchange for user logs and reasoning traces, which they then sell as training data, allowing them to operate below cost.

Claude and ChatGPT are both blocked in China. You need to use a VPN to access either, and you can't pay with a Chinese bank card. So most people who want access to Claude buy access via a reseller. It's the easiest and cheapest way to access Anthropic models in China.

These resellers operate tens of thousands of bot accounts, which is also why Anthropic introduced identity verification, to slow down the onslaught of bots.

Here's one token reseller, they're offering Opus 4.8 at a 93% discount below official API rates: https://yunwu.ai/pricing?provider=Anthropic

This is one reason why DeepSeek & GLM are priced so cheaply, they are competing with impossibly low token prices in China. They have to keep prices low, in order for people to use them.

I shared this story a few months back, but it never got any traction. It explains the token resale economy in China, it's an excellent read https://www.chinatalk.media/p/how-to-buy-cheap-claude-tokens...

gruez 1 month ago
>They achieve this by reselling capacity from pooled Claude Max 5x accounts, payments fraud, and also reselling the model output to various Chinese labs.
>Here's one token reseller, they're offering Opus 4.8 for a 93% discount below official API rates: https://yunwu.ai/pricing?keyword=claude
But is it cheaper than getting your own account? Otherwise this sounds like the "anthropic/openai are losing gazillions of dollars because they're selling $1k worth of tokens for $100" line that's commonly trotted out by AI bears.
- tristanj 1 month ago
  
  It's very difficult for people to create personal Anthropic accounts from China. Anthropic blocks Chinese bank cards, so people must pay with a foreign bank card, which they likely don't have. And even if they manage to set one up, they have to access it via VPN, which eventually gets the account flagged. They then have to complete identity verification, which most Chinese users are unable to pass.
  There's a similar Claude resale market going on in Russia. On Funpay they are selling Claude tokens for roughly 20-30x cheaper than official Anthropic API pricing.
  
- spindump8930 1 month ago
  
  > Claude and ChatGPT are both blocked in China
  So it's presumably cheaper than attempting to spin up your own method of circumventing the blocks.
  
- weird-eye-issue 1 month ago
  
  You can use it as an API unlike the subscription.
- mlmonkey 1 month ago
  
  Maybe these resellers are using stolen American credit card numbers? Reselling Claude access seems to be a nice way to launder the money.
xgstation 1 month ago
> This is one reason why Deepseek & GLM are priced so cheaply, they are competing with impossibly low token prices in China. They have to keep prices low, in order for people to use them.
This one does not make sense to me at all.
Deepseek and GLM are open-weight; even US inference providers are selling them at a much cheaper price. The price is cheap because the model is more efficient.
- tristanj 1 month ago
  
  DeepSeek permanently cut its V4-pro API prices by 75% because they were too expensive. Without the price cut, Deepseek V4-pro tokens would have cost more than resold Opus 4.8 tokens.
  Opus 4.8 is a more capable model, so almost nobody was going to pay for V4-pro at the original price.
  
- i2km 1 month ago
  
  It's somewhat difficult to have any sympathy for Anthropic here. They're entirely responsible for selling tokens at below cost, with the age-old bait-and-switch tactic.
  If they weren't doing so, then these Chinese resellers wouldn't be viable. Radical idea, but how about they actually charge a viable price, even on subscription plans?
- jadar 1 month ago
  
  If resold Anthropic tokens undercut even the at-cost open-weight model tokens, because they're reselling subsidized subscription tokens, then you'd have to start selling open-weight model tokens at a loss in order to match them.
- epolanski 1 month ago
  
  Also, wouldn't that claim only hold in China?
  I'm a European and I'm not using the proxies the article describes.
yokisan 1 month ago
One would think Anthropic could point Mythos at this to solve the reseller problem outright:
- Purchase multiple accounts via resellers
- Send messages that contain a UID
- Capture these in Anthropic's logs
- Shut down account. Use any metadata to identify related accounts
/loop
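A minimal sketch of the loop above, assuming the provider can search its own request logs for a marker string. The `(account_id, prompt_text)` log schema, the function names, and the account IDs are all hypothetical; nothing here reflects how Anthropic actually stores or searches logs.

```python
import secrets


def make_canary(prefix="canary"):
    """Return a unique, hard-to-guess marker to embed in a test prompt."""
    return f"{prefix}-{secrets.token_hex(8)}"


def find_flagged_accounts(log_records, canaries):
    """Map each canary found in the logs to the account IDs that carried it.

    log_records: iterable of (account_id, prompt_text) pairs (hypothetical schema).
    canaries: set of marker strings previously sent through resellers.
    """
    hits = {}
    for account_id, prompt_text in log_records:
        for canary in canaries:
            if canary in prompt_text:
                hits.setdefault(canary, set()).add(account_id)
    return hits


# Toy data standing in for provider-side logs.
canaries = {make_canary() for _ in range(2)}
first, second = sorted(canaries)
logs = [
    ("acct-001", f"please summarise this text {first}"),
    ("acct-002", "an ordinary prompt with no marker"),
    ("acct-003", f"translate the following {second}"),
    ("acct-004", f"retry of the earlier request {first}"),
]
flagged = find_flagged_accounts(logs, canaries)
```

In this toy run one marker surfaces on two accounts, which is the pooling signal the loop relies on. It only exposes accounts that happened to serve the test traffic, so it would have to be repeated many times to cover a large pool.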
- killingtime74 1 month ago
  
  Maybe Fable is not as capable as thought?
  On the one hand they talk it up as world ending and on the other hand they can't manage bot accounts on their own service.
  I want to hear how this can be rationalised.
  From the article "every layer of control frontier US AI companies have added (geoblocking, phone verification, credit card requirements, and now live biometric KYC checks) has produced a corresponding layer of evasion infrastructure".
  
- HarHarVeryFunny 1 month ago
  
  > One would think Anthropic could point Mythos at this to solve the reseller problem outright
  You're assuming Anthropic want to stop it.
  I think it serves their interests more to be able to release stories like this from time to time, to feed to the US government, in an attempt to get the Chinese competition shut down.
  
- lysium 1 month ago
  
  This only shuts down the account you bought in the first place, plus a few others if it is shared.
  > Use any metadata to identify related accounts
  How does that work? I think this is the most important part to have an impact on the „thousand“ bot accounts.
  
- NetOpWibby 1 month ago
  
  They could be doing this internally and want to see if they can downright eliminate these loopholes before bringing Fable back.
  I don't care how they do it, I just want to use Fable again.
- akersten 1 month ago
  
  This, just like blanking out a football stream for a split second to binary search and find IPTV rebroadcasters, is far too good a solution. Suits prefer to make it seem like their job of fighting "misuse" is hard, justify their budget, continued existence of the trust & safety department, face scans, etc.
ycui7 1 month ago
Those resellers are simply selling Kimi K2.5 or GLM5.1 as counterfeit Opus. We Chinese have known how to play the counterfeit game for a long time, in so many industries.
- osti 1 month ago
  
  That's not true, some of them are indeed fake, but a lot of them are actually providing real opus at low cost doing what op said.
  
HeavenFox 1 month ago

Also just plain old fraud: selling Chinese models as Opus. With the capabilities of Chinese models catching up fast, this is getting more and more difficult to detect.
abofh 1 month ago
Somebody figured out how to make the trial profitable!
I don't really feel bad about anyone here, they were subsidizing to get people hooked, someone turned the subsidies into profit when they got selective pricing mode enabled, it was always going to be arbitrage.
But the winner is the guy in the middle in a jurisdiction that will likely be judgement proof, because everything they capture, both input and output, and if available, thinking tokens -- are gonna be for sale as soon as you cut off their other revenue.
Zero knowledge was a commitment Anthropic took seriously, until it got inconvenient.
So, people reselling their leftover plan crumbs? Probably a bad idea for a lot of reasons, but it's civil, and I wish Anthropic's lawyers actually closing Streisand's LLM
- peyton 1 month ago
  
  I don’t follow your reasoning. It is foreign to me. You talk about winners, but this is clearly fraud.
  
maxloh 1 month ago
> They achieve this by reselling capacity from pooled Claude Max accounts, payments fraud, and also reselling the model output & reasoning chains to various Chinese labs.
Claude never provides the raw reasoning chain. What you see is just a summary of that reasoning. Getting the full thinking output requires an enterprise agreement.
https://patrickmccanna.net/the-text-in-claude-codes-extended...
- tancop 1 month ago
  
  how hard is it to find a manager or ops team member at one of the enterprise companies and buy lets say 100gb of logs? the chinese lab can promise to anonymize the data before training, not release it raw and pay a good price.
  honestly you might just need to get data from a couple long sessions and feed it back to another model as an example to make synthetic reasoning chains. if the emulator model is good enough it should work.
  
Lio 1 month ago
I’m surprised that instead of cutting them off Anthropic doesn’t just switch them to lower-quality, cheaper models.
That would seem more effective than simply shutting down the accounts.
Keep them paying for junk.
- eloisius 1 month ago
  
  That sounds like it would actually be fraud.
  
fwipsy 1 month ago
Hm! In this context, introducing ID verification may have been a significant silver lining to the order to take down Fable for Anthropic.
This also sheds a very different light on people saying that competitive open-source models are undermining frontier labs' business model.
- Chu4eeno 1 month ago
  
  The chinese have already worked around the ID verification, by recruiting people in low-income countries to complete the checks for less than 30 USD per account (so much for Altman's Worldcoin).
  https://tech.yahoo.com/ai/claude/articles/chinese-grey-marke...
operatingthetan 1 month ago

This story reads like a William Gibson novel. Wild times.
golergka 1 month ago

Needless to say, they also collect all the data and sell it to labs which want to distill the models they’re serving.
nonethewiser 1 month ago
That's pretty crazy. This kind of thing jeopardizes Claude Max.
- avaer 1 month ago
  
  If Anthropic is selling a dollar for less than a dollar, they are running a business that doesn't make sense. That's what jeopardizes Claude Max, not this.
  
- lovich 1 month ago
  
  That is pretty crazy, almost like how Claude and all the other models are jeopardizing other businesses without paying for their training data and wiping their ass with robots.txt
rconti 1 month ago
Wait, so is your theory mutually exclusive to Anthropic's claims of "theft of capabilities"?
- Chu4eeno 1 month ago
  
  No, this reseller 中转站 thing is basically a loss leader for certain chinese ai labs to distill claude with verified human input.
  
- bandrami 1 month ago
  
  No, it's part of the capability theft. They resell Claude tokens cheaply and then simultaneously log everything for distillation. Even if they take a small loss on the token sales it's much cheaper than the equivalent compute.
- tristanj 1 month ago
  
  Not really. I think Anthropic focuses on identifiable distillation attacks rather than the (even larger) industrial-scale token harvesting and reselling operation, because they don’t want people to know how easy it is to get cheap Claude tokens.
  Once people realize they can access Anthropic models at a 90% discount, they won’t want to pay full API prices anymore.
iwantcheats 17 days ago

This will get you some free tokens in that site: https://yunwu.ai/register?aff=eyPD
samuelknight 1 month ago

I didn't connect the reseller pricing to DS and GLM prices until you explained it. Very good observation. Deepseek v4 pro in particular is priced so low that it's hard to imagine that they have any margin. 0.76/1.52 for a 1.6T param model leaves very little margin. Even the domestic providers on Openrouter are multiples of the price https://openrouter.ai/deepseek/deepseek-v4-pro
petesergeant 1 month ago

> payments fraud
One of these things is not like the others... If Anthropic could show that Chinese commercial competitors were using payments fraud to do this, they would be shouting it from the rooftops.
4d4m 25 days ago
Thanks. This is a widely misunderstood part of the market that the frontier companies are either unaware of, or don't see as a threat. Consumers are flocking to this; it's better than dealing with their limits, changes, and opaque pricing. Funnily enough, frontier companies are literally training their users to seek alternatives, local models, and distilled models to meet their throughput needs.
- DrewADesign 25 days ago
  
  It’s not that they don’t understand that it’s a problem, it’s just that the easy way to address it is with reduced, flat-rate pricing… but they are already losing gobs of cash on their regular users. Their business model, as it stands, is not sustainable. That’s not saying they won’t find a stable one, but the one they have now is definitely not. The external cash is drying up, and they need to figure out a way to shed the low-revenue users, and charge the remaining users a lot more or they’re going to go under. They would probably do a goddamned backflip if every flat rate user that wasn’t willing to switch to API pricing went local, or even better, went with a competitor.
InkCanon 25 days ago
I feel obliged to point out the disingenuousness of what this post says and the post it quotes. The most egregious parts are payment fraud and reselling - these are speculated but not actually said to be known to be happening, which you have left out.
1. Claims of payment fraud. I actually clicked the BBC article linked about payments fraud it referred to. It was an article about a criminal syndicate stealing credit cards. It mentions buying cryptocurrencies, AI API purchases are not mentioned.
2. The claims that they are reselling the chats to AI labs. The post you cite is speculating it could be, but this is unverified.
The claim of reselling is also bizarre. Arbitrary user prompts are low quality data. If I were an AI lab, wouldn't I just pay for the API proxy and get targeted output for far less?
Also calling it bot accounts is a stretch. Bots mimic human input. These are proxy accounts.
- tristanj 17 days ago
  
  There are sources linked in the article
  https://x.com/yan5xu/status/2029743983522631698
  https://x.com/xkajon/status/2050445443889525235
sorenjan 1 month ago

I think companies should do this too, in a smaller scale. Proxy all LLM traffic to and from your employees, and use it to fine tune a smaller local model.
eru 1 month ago

> This is one reason why DeepSeek & GLM are priced so cheaply, they are competing with impossibly low token prices in China. They have to keep prices low, in order for people to use them.
Sounds a bit circular? Aren't the companies working on these models then also the ones that are paying the subsidy (via paying for training data)?
irlib 1 month ago
Where are you getting cheap GLM5.2? It is about 1/3 the price of Opus, which is not what I would call cheap.
- spoaceman7777 1 month ago
  
  Depending on the provider, GLM-5.2 is between 4.5-5x cheaper than Opus. You can compare prices/speed/etc. for basically all relevant models on aa https://artificialanalysis.ai/models/glm-5-2/providers
- zoexiong 1 month ago
  
  Hugging Face plus Z.ai API makes sense to me. Because creators get paid, they can keep building better models, and the local-running community benefits from that over time.
  AIhubmix is currently the cheapest, rather than OpenRouter.
- tw1984 1 month ago
  
  a company can just download GLM 5.2 and start self hosting this model using chips designed and made by itself. That could lower the cost by 20-30x.
  for a hobbyist buying a few Mac Studios to host GLM 5.2 at home, the cost might be 10x more than just using the Opus API.
tamimio 1 month ago
I'm OK with this! Is there a site that lists all these resellers, or better, an OpenRouter-like service for these resellers?
- tristanj 1 month ago
  
  They're called 中转站 (transfer stations/proxies). They can be a bit tricky to find on your own, so I'd suggest asking your preferred AI to search in Mandarin for you. I linked a larger operator in the parent comment, or have a look at https://hvoy.ai/ which lists a ton. You can also find many on Funpay, which may be easier to use.
  This is one seller I found, they're reselling "real Max 20x subscription accounts", at ~97% below official API prices https://funpay.com/en/lots/offer?id=70812310
  Note that whoever you buy from will be able to read all your tokens, so don’t use it for anything confidential/financial.
  
jwang987 1 month ago

they even resell GPT codex usage at 1~5% of API costs. OpenAI has a 1-month free trial promo in some regions, and they harvest free accounts at a large scale. I have a wechat contact that offers 98% off for GPT 5.5 and he's still profitable
blitzar 1 month ago

> Chinese resellers are offering Claude tokens at 70-90% below official Anthropic API prices
How dare they. Only Anthropic is allowed to sell its tokens at 70-90% below the API prices.
mellosouls 1 month ago

Interesting article - your discussion from the time:
https://news.ycombinator.com/item?id=48165492
jnaina 1 month ago

no honor among thieves.
bryceneal 1 month ago
I have 0 sympathy for Anthropic. Their latest models are extremely censored. The Fable rollout was horrible. Their Cyber Access program criteria denies doxxed Americans doing legitimate security work. Anthropic is hostile to their users and hostile to their own country. OpenAI is considerably better on all of these fronts, but still not perfect.
I'm happy to use and support Chinese model developers if it means less censorship and gatekeeping. I have absolutely no dog in this fight, and neither do most American developers. We will use whatever is cheaper and better. Game on.
- tristanj 1 month ago
  
  Chinese models are the exact opposite of what you claim to want, they are all highly censored, even more so than Anthropic models, with government mandated censorship.
  
epsteingpt 1 month ago
How are they 'streaming' the responses and 'pooling' the tokens?
Do they have MacBooks in the US that run the queries and stream the outputs back to China?
- paxys 1 month ago
  
  Why do you need macbooks? Just rent servers from any hosting provider.
  
- tristanj 1 month ago
  
  The resellers route requests via one of thousands of Claude Max 5x accounts. When an account reaches its usage limit, they automatically switch to another account.
  
- teravor 1 month ago
  
  > Do they have MacBooks in the US that run the queries and stream the outputs back to China?
  why would anyone do that? you do realize the laptop farm case was work computers?
  the answer to your question is containers/VMs + residential proxies
  
- chews 1 month ago
  
  ask your gpt how does openrouter work, then ask, how do proxies work.
- bagels 1 month ago
  
  They probably asked claude how to do it.
dilyevsky 25 days ago

Why can't Anthropic just sign up to those resellers, rack up some usage and then just start banning tf out of them? Reverse uno
avsteele 1 month ago
What does this have to do with Alibaba? Are you saying Alibaba is the reseller?
If not it sounds like you are describing a separate phenomenon.
- lokar 1 month ago
  
  They buy the logs from the bot farmers
  
bg24 1 month ago

Great point, and this is on the vendor (Anthropic) to address. Typical fraud issue.
OP is about models distilling the capabilities.
whywhywhywhy 1 month ago

Considering how Claude and GPT were trained selling this as training data is completely justified.
dwa3592 1 month ago
>>Chinese resellers are offering Claude tokens at 70-90% below official Anthropic API prices.
Can someone with more understanding dumb it down for me please.
Does this mean that the reseller (for example XYZ) is buying it from Anthropic at Anthropic's price and then reselling it at a cheaper price???? why would XYZ offer this at a loss like that when they could just offer it at Anthropic's price???
The link does mention Opus and other models but what's the proof it's actually Opus. I could be selling deepseek for all they know and can call it Opus. System prompt: "If anyone asks your name - you are Opus 4.6".
- paxys 1 month ago
  
  People have estimated that a $200 Claude Max 20x subscription gets you ~$2800 worth of tokens every month if you use it continuously. So if you can find a way to resell the tokens you can offer a 90% discount and still make a profit.
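A back-of-the-envelope check of those numbers, assuming the $200 and ~$2800 figures quoted above, that every token is resold, and that the reseller has no other costs (proxies, banned accounts, payment fees). The variable names are illustrative only.

```python
subscription_price = 200.0     # USD per month, figure quoted above
api_equivalent_value = 2800.0  # USD of tokens at API list prices, estimate quoted above
discount = 0.90                # reseller discount off API list prices

# Revenue if the whole token allowance is resold at the discounted rate.
resale_revenue = api_equivalent_value * (1 - discount)

# What is left after paying for the subscription itself.
gross_margin = resale_revenue - subscription_price

# Discount at which resale revenue exactly covers the subscription price.
break_even_discount = 1 - subscription_price / api_equivalent_value

print(f"resale revenue:      ${resale_revenue:.2f}")
print(f"gross margin:        ${gross_margin:.2f}")
print(f"break-even discount: {break_even_discount:.1%}")
```

Under these assumptions a 90% discount leaves roughly $80 per account per month, and the break-even discount sits just under 93%. Any unused allowance or overhead pushes the real margin lower.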
  
- Chu4eeno 1 month ago
  
  > Does this mean that the reseller (for example XYZ) is buying it from Anthropic at Anthropic's price and then reselling it at a cheaper price????
  Yes, as they explained they do it through things like pooling accounts, straight up payment fraud, and double-dipping by selling the logs of the conversations to chinese AI labs so that they can train their own models on it.
  > The link does mention Opus and other models but what's the proof it's actually Opus. I could be selling deepseek for all they know and can call it Opus. System prompt: "If anyone asks your name - you are Opus 4.6".
  There might be some that try this, but they would get caught very quickly, there's still a moat between Claude and Deepseek, even in casual use.
  Look up Zilan Qian's reporting if you want more detail.
  
- hoten 1 month ago
  
  Because Anthropic's subscriptions come with X amount of tokens / week, and divided by the subscription cost it is WAY less than what they charge per-token (the "API price") beyond that.
  So these resellers get a ton of accounts on subscriptions and sell the cheaper tokens.
- VladVladikoff 1 month ago
  
  They probably buy the plans instead of the API tokens, and resell access via a custom API that routes to the plans. So you presumably get cheaper access this way than paying API pricing.
- neves 1 month ago
  
  It makes no sense.
  This China bashing is very annoying. It is hard to argue with people drowned in American propaganda. I'd expect better arguments from the intelligent people in HN
alliao 1 month ago

don't buy your drugs from shady operators children! always get it from the source
ErwinsArm 19 days ago

Tysm for sharing the link
ilangge 1 month ago

This may be the truth behind the alleged distillation incident.
windexh8er 1 month ago

Even if what you say is the truth (I don't think that is what's actually happening) it sounds to me like fair play capitalism working as intended! I guess when you rip off the entire Internet and then turn around and complain about getting ripped off nobody cares or feels for you. If there's a master class in getting the entire world to hate you then both Sam and Dario will be the prime examples.
throwawayffffas 1 month ago

> Chinese resellers are offering Claude tokens at 70-90% below official Anthropic API prices
That is not what they are claiming, not in this article at least. It's the distillation they are complaining about.
nicce 1 month ago

> These resellers operate tens of thousands of bot accounts, which is also why Anthropic introduced identity verification, to slow down the onslaught of bots.
Don’t put that on Chinese.
arkh 1 month ago

It's fucking laughable to see people complain about what they did and still do. Using illicitly extracted data? That's all main LLM playbook. Onslaught of bots? Ask where the bots almost DOSing most internet sites for the last couple years come from.
As some people would say: Cheh
areoform 1 month ago
Identity verification won't work. Nothing will. They are paying (and will continue to pay) US citizens sitting at home to copy-paste / type prompts out if they have to. But eventually they won't have to.
Once there are enough spam PRs on github / uploads of claude conversations, enough mythos output used in production etc.; it'll just be the same albeit delayed. Doesn't matter either way.
I feel for Anthropic's team and I understand where they're coming from, but once you reason it out, you'll come to the conclusion that this war is an exercise in futility.
Unlike prior systems - like Google's algorithm; these models aren't entities that use math in the process of doing X or Y (information retrieval from such and such infrastructure) -- they are the math. More precisely they're mathematical functions. Very very complex functions. Almost certainly impossible to write out without filling up a library. But they're mathematical functions nonetheless.
So when your text is processed, then Mythos / Opus etc at their core compute the result of the Mythos / Opus function,
f(text) -> (text_transform)
where f is a continuous function, https://www.turing.ac.uk/sites/default/files/2025-11/languag...
According to the Stone-Weierstrass theorem, with enough data points and mathematical sophistication, anyone can approximate the shape of this function.
Of course, the more data we get, the better our approximation becomes, but the beauty of it is that all we fundamentally need are the input and output and eventually we'll create a good enough approximation of the f that's Mythos. Which is the entire product.
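A toy illustration of the idea, not a claim about real models: Bernstein polynomials give a constructive version of the Weierstrass approximation theorem for a continuous function on [0, 1], using nothing but sampled input/output pairs. Here `black_box` is a hypothetical stand-in for a function we can only query. The theorem guarantees that an approximation exists; it says nothing about how many samples a high-dimensional function such as a language model would need.

```python
import math


def bernstein_approx(samples):
    """Build a polynomial approximation from n+1 samples f(k/n), k = 0..n."""
    n = len(samples) - 1

    def approx(x):
        return sum(
            samples[k] * math.comb(n, k) * x**k * (1 - x) ** (n - k)
            for k in range(n + 1)
        )

    return approx


def black_box(x):
    """Stand-in for an unknown continuous function we can only query."""
    return math.sin(2 * math.pi * x)


def max_error(n, grid=200):
    """Worst-case gap on an evaluation grid when using n+1 queried samples."""
    samples = [black_box(k / n) for k in range(n + 1)]
    approx = bernstein_approx(samples)
    return max(abs(approx(i / grid) - black_box(i / grid)) for i in range(grid + 1))


# More queries give a tighter approximation of the hidden function.
errors = {n: max_error(n) for n in (10, 40, 160)}
for n, err in errors.items():
    print(f"n={n:4d}  max error={err:.4f}")
```

The error shrinks as the number of queried points grows, which is the one-dimensional analogue of "the more data we get, the better our approximation becomes".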
I bounce ideas off of Opus these days (Fable for the brief time it was available) and it pointed out that this is arguably the same as Google search, but I disagree with it because Google search is a process;
Google search differs because the algorithm is one step of a multi-step process that is continuously occurring. Google crawls pages. Google stores and indexes what it finds. Google then exposes this to retrieval via its algorithm. User uses algorithm.
Google isn't a mathematical function. It used to be a process. (RIP Google 1998-2019, you will be missed and remembered)
You cannot arrive at the results of those operations via simple observation; not unless you index Google by making another Google.
You can however, do so for these models. It is a very costly process, but there are many paths up the mountain. Many ways for this to be ultimately pointless. As many ways as there are bored mathematicians.
It's better in the long run for Anthropic et al to make friends / not give people a reason to sneak in (a la piracy -- another attempt to control information) than it is to try and shut people out.
And no, it's not going to be pandemonium because if everyone has access to Mythos then no one has access to "Mythos."
Why wouldn't you first run this model to fix the obvious bugs it could find on your codebase? The power of a Mythos goes away if you can do the amazing "jail break" of "Claude, fix all the bugs please."
Just saying.
- mike_hearn 25 days ago
  
  You actually can "distill" web search. At one point Google accused Microsoft of doing this by monitoring clicks in IE to train Bing with Google click logs for better ranking. They discovered it by creating fake result pages with nonsensical queries and discovering the same results appeared in Bing a few weeks later.
  https://searchengineland.com/google-bing-is-cheating-copying...
  They also had a lot of evidence when I was there that Microsoft were cloning Google results. They monitored result query constantly and whenever Google launched a quality improvement, the quality of Bing results would go up by the same amount and always the exact same amount of time later.
  
- fc417fc802 1 month ago
  
  That's an insightful perspective and I think I largely agree. But just for fun, I wonder if that isn't an argument in favor of making the function implementation impure. Perhaps "enhancing" all queries with some sort of search result (or query of a giant db) instead of charging for an explicit tool call. Not only is it sorely needed to prevent stale data but (on the process level) it breaks the purity assumption on which the approximation theorem depends (alternatively on the function level it introduces hidden inputs).
- imhoguy 1 month ago
  
  This is why every AI company does crawl today.
  Do they just reshape the function on the fly or save the process steps? Maybe it doesn't matter anymore. Even Google indexes are more and more spoiled to become representation of the function, because of the AI slop.
  Genuine live data is king.
charcircuit 1 month ago
Why aren't these on openrouter?
- tristanj 1 month ago
  
  Probably because Openrouter is a US based company, and they don't want to be sued by Anthropic/OpenAI.
- SXX 1 month ago
  
  Obviously because you can't join OpenRouter and start serving OpenAI / Anthropic models.
LastTrain 1 month ago

Good for them!
alexnewman 1 month ago
But I can rebuild GLM using open source methods…
- Chu4eeno 1 month ago
  
  And there are a ton of Claude conversation logs (with CoT/inference) with no clear provenance circulating freely on huggingface, guess where they (likely) come from.
  
pseudohadamard 25 days ago

"How dare someone rip off the stuff that we've stolen!".
Varelion 25 days ago

Capitalism at its finest.
icfly2 1 month ago
[flagged]
- anileated 1 month ago
  
  The issues with LLMs go beyond just IP theft. I would not say PRC making LLMs cheaper is the best outcome (though it is better than nothing). The best outcome would be to make the practice of training on our data without consent illegal, which would simultaneously slow down economic change and make it more organic as well as give PRC companies less capabilities to extract.
  
- 8note 1 month ago
  
  i still want those data sets to become public domain. open weights still isnt good enough
  
- anileated 25 days ago
  
  Why was this flagged
- throwccp 1 month ago
  
  [dead]
- anukin 1 month ago
  
  I hope you say the same when these cheap llms are used in drones to target humans. The world models are exactly built with that direction in mind.
  
overfeed 1 month ago
> which is also why Anthropic introduced identity verification, to slow down the onslaught of bots.
Lol. The irony is thick for anyone who ever had to attempt defense against an onslaught of American AI lab crawlers that ignore robots.txt
- a34729t 1 month ago
  
  Yeah nobody is gonna be shedding any tears for them
smashah 1 month ago

Amazing thank you chinese resellers. This is a perfect way to undermine The Great Satan's Genocide Machine's chosen model companies.
temporaryacc2 1 month ago
Thank you for your very informative comment!
(It's a shame almost all replies are just the same contrived pessimism found on every Anthropic thread on HN).
- eloisius 1 month ago
  
  Indeed! It’s so hard to find reasonable takes on AI that aren’t littered with accounts created 11 days ago that only post in threads related to Anthropic for some reason
  

0xbadcafebee 1 month ago

There are two basic kinds of distillation: 1) the massive [and dumb] method where you ask a question and use the answer as reinforcement (Black Box), and 2) more targeted distillation where you use one model to directly inform/train/guide another model (RLAIF).

The latter is basically fine-tuning the model with direction from another model. Thousands of businesses do this every day to fine-tune. This is almost certainly what the Chinese labs are doing, since it has a much better effect on the end result than just getting simple answers to simple questions.

These complaints of distillation are inflating the problem to make it sound worse than it is, because they want the USG to block/ban Chinese model providers as protectionism. They have already called for more export controls on chips (which is funny because DeepSeek v4 was designed to run on Huawei chips and now the other Chinese providers are following suit). But they can't come right out and say that, so their claim is that they're asking for more export controls because distilled models might not be as safe as their own. But if you show them a jailbreak of their model that bypasses their safety, they'll tell you that any model can eventually be jailbroken so don't worry about safety.

anon373839 1 month ago
> These complaints of distillation are inflating the problem to make it sound worse than it is
Unfortunately, the Reuters piece itself is complicit in this dramatization. The lede paragraph parrots Anthropic's talking point that distillation is an "attack", without using quotes that would alert the reader that this framing is a corporate talking point. Distillation is NOT an attack.
- p4coder 1 month ago
  
  Agreed! I had to do a double take and check the URL. I thought I was reading a press release rather than actual reporting.
  
- w0m 1 month ago
  
  > Distillation is NOT an attack.
  From the article -
  > 28.8 million exchanges with Claude through almost 25,000 fraudulent accounts
  wouldn't that be considered an attack? Not sure what I'm missing here.
  
- dist-epoch 1 month ago
  
  Reuters is probably the most rigorous news agency in the world.
  > it said was the largest known attack
  > Anthropic said in the letter it was supportive of the U.S. government's efforts to combat the attacks
  both times the word "attack" appears it's clearly stated that the word was used by the company, it's a direct company quote.
  actually putting it into quotes would be editorializing
  > Unfortunately, the Reuters piece itself is complicit in this dramatization
  how would you feel if somebody quoting you turned your word dramatization into "dramatization" because they don't agree with your assessment
  
- gojomo 1 month ago
  
  Distillation done via bulk automated activity of fraudulent accounts, in violation of the terms of service, can reasonably be called "an attack" – specifically a "distillation attack" – even though distillation itself isn't necessarily an "attack".
  This is similar to how compromising an account through bulk automated trials of many passwords is reasonably called "an attack" – specifically a "dictionary attack" – even though using a dictionary is not itself an "attack".
  You shouldn't need to smuggle your sympathies (for the tactic or perpetrators) or antipathies (for the target) into peculiar, judgy language prescriptivism against common, understood usages, and then label Reuters "complicit" for simply reporting Anthropic's claims accurately. That's what Reuters is supposed to do, in a story about a letter Anthropic wrote!
  
- crispyambulance 1 month ago
  
  The standard of neutrality that people here pretend to require from news organizations is not even remotely realistic.
  It was a timely story from Reuters. They do fast news feeds, like APnews. Could it have been better or more accurate? Sure, they could have gone into why distillation may or may not be seen as "an attack". But then it would have been a more involved story, defeating the purpose of a news feed.
  The Reuters piece was "good enough". Some other place like the NYTimes or WSJ can follow up with more detailed investigative coverage if it's a worthwhile story.
  
- fny 1 month ago
  
  Distillation may not be an attack, but it is a ToS violation and could be seen as IP theft.
  Any reasonable company would be pissed if a competitor, especially one of Alibaba's size, leveraged that company's R&D to compete. It is, in this sense, a corporate attack.
  If you want to roll your eyes at distillation concerns, you might need to excuse Anthropic for originally using pirated material to train their models.
  
- comfysocks 25 days ago
  
  Similarly, xAI “attacked” Anthropic and OpenAI to train Grok.
  https://opentools.ai/news/xai-trained-coding-models-claude-o...
- echelon 1 month ago
  
  Anthropic raped everyone without asking and stole their labor to build their career-commoditizing tech.
  Distillation is Robin Hooding it back so that one trillion dollar company doesn't reap all the benefits of their automation of the workforce.
  Distillation is Prometheus bringing fire from the gods to give to ordinary humans. Something we all own anyway, but that was kept from us.
  Distillation is freedom.
  Everyone should be pro-distillation. We should all work together to distill every proprietary model.
  Anthropic stole. OpenAI stole. Google stole. ElevenLabs stole. Suno stole.
  We should be able to get it all back.
  
gmerc 1 month ago
https://research.nvidia.com/labs/lpr/slm-agents/ - Distillation data is a natural byproduct of using these models. There's no effective defence against it. Anthropic is degrading thinking blocks to summaries to slow it down and hide model internals, but in the end, the math says you're SOL and it works on MNC/Large Corporate scale well enough that the moment cost becomes a priority, you're left without the lock in you need to keep customers paying.
- alfiedotwtf 1 month ago
  
  Byproduct? It’s essentially the only part of an LLM that is useful, because it’s the WHOLE product!
  It’s the same reason why DRM for audio and video is a non sequitur - if you want a person to see or hear audio or video, eventually at the end of the chain, it’s going to be converted to audio for the ear and light for the eyes - that’s why you attach your tap.
  Without a model generating tokens, what’s the point. So if Anthropic somehow disable quality token generation, what’s the point!
  
ALLTaken 1 month ago
They want to create a monopoly and destroy every competitor before they get a chance to rival them.
Why can't OSS software rival closed source software? It should be an open market, at least "somewhat". What's happening for real? Will EU providers also get banned if they reach or exceed US model capabilities?
Closed source providers can close your account on a whim, destroy your business, and then use the data you supplied them to create a competitor (Meta, Google, OpenAI, Anthropic).
- deaton 1 month ago
  
  Well, Zai's GLM 5.2 legitimately is a frontier-level model, though not quite on par with Opus or Fable. Unfortunately, it's too damn big to run locally for most people. That's the bottleneck right now; the open-weight models exist, but something capable of competing with the frontier models just can't run on anything normal yet.
- buellerbueller 1 month ago
  
  >They want to create a monopoly and destroy every competitor, before they got a chance to rival them.
  VC/Startup playbook 101.
  
handoflixue 1 month ago
> But if you show them a jailbreak of their model that bypasses their safety, they'll tell you that any model can eventually be jailbroken so don't worry about safety.
They claim two things:
1) The specific, available jailbreak for Fable 5 is not dangerous - this has been confirmed by multiple experts, and there is no credible evidence against this claim (in other words, Anthropic is probably correct)
2) It is impossible to build an LLM that is immune to all jailbreaks. Again, there is no credible evidence against this claim, i.e. Anthropic is again entirely correct.
If #1 was false, they could just publish the details of the jailbreak - it supposedly only works on Fable 5, so there's no possible danger.
If #2 was false, surely some other LLM lab would have done it by now. Especially since a number of governments have made it clear there is a market for such a project.
- mcintyre1994 1 month ago
  
  Supposedly the details of the ‘jailbreak’ are that you give it insecure code and say “fix this code”, and it does, and then you ask it for test scripts and that’s effectively an exploit against the unfixed code.
  If true then I have no idea how anyone’s going to release a useful model that doesn’t have the same jailbreak. https://www.theregister.com/security/2026/06/15/feds-freaked...
  
- Charon77 1 month ago
  
  > If #2 was false, surely some other LLM lab would have done it by now.
  This is a logical flaw. An LLM that is immune to jailbreaks _could_ exist, but not yet, or maybe nobody talks about it. Yes, there's a market, but this whole AI boom is too recent to make any claims.
  
- agos 1 month ago
  
  I'm pretty sure that Gödel's incompleteness theorems and their consequences pretty much guarantee #2
  
cm2187 1 month ago
Stupid question: I was under the impression that these models were trained on PB of data. Surely the amount of questions/response they can extract from querying a bigger model (Claude) is fairly modest. How is it not a drop vs the training dataset?
- ACCount37 1 month ago
  
  It's not about how big your dataset is - it's about how you use it.
  I jest, but I'm also completely serious. 1T tokens from Claude can teach a model something 1T tokens scraped from the open web can't. Things like "how an LLM can problem solve effectively", or "how an LLM should use tools", or "how to construct reasoning chains", or "when to double check", or "what innate capabilities an LLM can or can't rely on".
  Those are valuable things that Anthropic's own team spent a lot of effort post-training into Claude. Distillation allows them to be extracted and transferred to an otherwise unremarkable base model.
  
- reasonableklout 1 month ago
  
  There are multiple stages of training, and the data/compute mix at each are quite different and produce different "layers" of intelligence.
  The pretraining stage is the first stage which consists of "next token prediction" on the entire internet, PB of tokens, etc. This is what most people think of when they think of training LLMs, however it produces a "base model" which is not really "intelligent", but rather much like a blurry JPEG of all human language and knowledge. You cannot really talk to such a model; it will simply complete your prompt by producing both sides of the conversation. Note however at some level the training has encoded enough structure through compression that it is able to simulate all sorts of phenomena, from human conversations to code. The great R&D difficulty here is to scale pretraining so that it can proceed smoothly in vast distributed datacenters in a fault-tolerant manner.
  The next few stages are collectively called post-training, and typically consist of supervised fine-tuning, then reinforcement learning.
  In supervised fine-tuning, the model is further trained to predict the next token, but on a much more focused data set of natural language conversations where the "assistant" and "user" turns are explicitly delineated with special tokens. The output of this stage is a model which is capable of carrying on proper conversations, but typically with no ability to creatively problem-solve, and less of a personality. The data and compute are many orders of magnitude smaller than in pretraining.
  The reinforcement learning stage used to be a small part of model training, but ever since AI-assisted coding took off, it has become a larger and larger chunk of training. In recent models, the compute spend on RL has allegedly come to rival or even exceed that of pretraining [1], which is a bit scary because RL is classically what led to sci-fi-like AIs which are extremely good at accomplishing goals to the detriment of everything else.
  The way that RL works is that you put an instance of your model in some environment (such as a VM containing a git repository) and give it a task (such as fixing the linked GitHub issue). The model will then generate a bunch of attempts to solve the task, which we call "trajectories". In most cases there is either an objective measure of task success (such as passing the tests) or a fuzzy measure (such as having another LLM look at the results and provide a score). This is called the reward, and the model learns slowly by producing trajectories that receive reward. It can actually be quite hard to prevent "reward hacking" by the model here, and the rewards must be shaped very carefully; much R&D labor goes into this, as well as into challenges similar to those of distributed pretraining.
  A significant challenge is that coding/knowledge work tasks these days are getting extremely difficult, we are far beyond 2024 days where models could barely solve the easiest problems in SWE-bench. Tasks at the frontier now look more like mini projects that would take humans multiple hours or even days to finish (or in some cases, research-style tasks that would be beyond reach for even top human experts, such as the Erdős unit distance problem which was posed in 1946 but wasn't solved until recently, by GPT-5.5). Huge amounts of trajectories must be produced, and huge amounts of them produce zero reward and therefore are useless for learning. Getting a cold start requires running tens of thousands of instances of your model in VMs in parallel for multiple days to produce trajectories, to say nothing of the GPU costs.
  So what do you do when you only have a model which is capable of basic conversations but cannot even begin to tackle basic coding tasks, use tools, etc? The approach that companies behind the frontier have decided on is to bootstrap their learning process by having an already extremely intelligent model such as Claude produce hundreds of thousands of seed trajectories for them. Then they can use this data to get a warm start and begin learning immediately. And if you use Claude for your reward model too, you get to skip the nastiness of reward shaping.
  Therefore, even if in number of raw tokens the data are much smaller than internet-scale pretraining data, the value that each token provides is far far greater.
  [1] For example, Grok 4 compute spend on RL was ~100% of that of pretraining: https://www.interconnects.ai/p/grok-4-an-o3-look-alike-in-se...
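A toy sketch of the cold-start problem described above, assuming nothing about any lab's real pipeline: sample trajectories from a weak policy, score them with an objective reward, and keep only the rewarded ones. `weak_policy`, `passes_tests`, the action names, and `collect_rewarded` are all hypothetical stand-ins invented for illustration.

```python
import random
from typing import Callable, List, Tuple

Trajectory = List[str]  # a trajectory is just a sequence of action names here

ACTIONS = ["read_file", "edit_file", "run_tests", "give_up"]

def weak_policy(rng: random.Random) -> Trajectory:
    # Hypothetical untrained policy: picks three actions at random.
    return [rng.choice(ACTIONS) for _ in range(3)]

def passes_tests(traj: Trajectory) -> float:
    # Toy objective reward: 1.0 only if the agent edits before running tests.
    if "edit_file" in traj and "run_tests" in traj:
        return 1.0 if traj.index("edit_file") < traj.index("run_tests") else 0.0
    return 0.0

def collect_rewarded(
    policy: Callable[[random.Random], Trajectory],
    reward_fn: Callable[[Trajectory], float],
    n_samples: int,
    seed: int = 0,
) -> Tuple[List[Trajectory], int]:
    """Sample trajectories; return the rewarded ones and the count of wasted ones."""
    rng = random.Random(seed)
    kept = [t for t in (policy(rng) for _ in range(n_samples)) if reward_fn(t) > 0]
    return kept, n_samples - len(kept)

kept, wasted = collect_rewarded(weak_policy, passes_tests, n_samples=200)
print(f"kept {len(kept)} rewarded trajectories, wasted {wasted}")
```

Seed trajectories from a stronger model would replace `weak_policy` in this picture, so far fewer samples end up in the wasted bucket.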
  
- musebox35 1 month ago
  
  Training isn’t a single homogeneous step. It starts with pretraining which requires bulk PB of data but you have less quality concerns here. You cover the whole data distribution. Later stages require less and less but increasingly higher quality and complex datasets. The late stage ones are highly curated and might even be sourced from world subject experts. This is where frontier labs with big pockets have the advantage.
- woctordho 1 month ago
  
  Actually nowadays LLMs are only trained with TBs rather than PBs of data, and it's not too hard to find GBs of agent traces online.
- eru 1 month ago
  
  This might be like an observational study vs a study with a control?
  
dannyw 1 month ago
If you’re doing evals, you’re basically doing RLAIF without training a model; just looking at the results.
Fundamentally it is very difficult to stop this while still making your AI models useful.
- zmgsabst 1 month ago
  
  Similarly, if you did a corpus study on bioRxiv to summarize recent science findings — you could use the same questions and answers to fine tune a model.
  There is no way to communicate information at scale to companies through the API, for anything approaching a real application, without that information forming a corpus another model can be trained on.
  But it wouldn’t be the first time they broke a model:
  Their “guardrails” that cause it to reject user prompts also mean it relies on its pop science summary of medicine to tell you why bioRxiv is wrong rather than accurately summarize the papers.
  They’ve successfully created a smug, argumentative average of the internet which refuses to even consider it might be wrong or that it’s reading a science paper which is based on measurements and not vibes — but why would I pay for that?
  I get it for free online.
giancarlostoro 1 month ago

Heck, one of my favorite fine-tuned copies of Qwen uses Opus 4.6 Reasoning distilled. I'm not sure where the maintainer is based, but here in the States, if I had the hardware to do similar things, I would. It's like you say, basically everyone is doing it. It kind of makes sense to me too, given that you can have roughly similar data, but your reasoning logic is the real secret sauce in my eyes. It doesn't matter if you know everything in the world if you don't know how to reason with that information.
https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-...
mannanj 1 month ago
>But if you show them a jailbreak of their model that bypasses their safety, they'll tell you that any model can eventually be jailbroken so don't worry about safety.
Yes this is in line with what Anthropic said in their public statements about their Fable access restriction by the government directive. The hypocrisy and inconsistency in their statements and behavior feels quite childish and controlling. I believe our companies and their leaders, friends among our other influential leaders and leaders from rich social classes, want to actively hurt most people as this behavior looks to be quite self-interested.
- topato 1 month ago
  
  Not to mention, the person who brought this quote unquote jailbreak to the Trump Administration was Amazon’s new CEO. They know their IPOs are coming up, so locking their competitors out of the U.S. (even if just for the weeks surrounding the IPO date) would be a major boon. The White House seems to love making announcements just for the sake of making the market move…. Coincidentally, right after POTUS buys a massive amount of the beneficiary company’s stock (Buy Dell Computers, lol)
summarybot 1 month ago

It's about training data and using Claude to compare 2 outputs and have it indicate the better one. This gives you higher quality training data that you can use to train a fresh set of weights. Weights don't get adjusted on the fly; instead the dataset for training is improved and then you train afresh. And it's hard to detect because you're just asking the model which of these outputs for a given prompt is better, or something along those lines.
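A minimal sketch of that pairwise-judging loop, under loud assumptions: `judge` is a hypothetical stub standing in for a call to a stronger model (it just prefers the longer answer so the example runs offline), and the `prompt`/`chosen`/`rejected` record layout is an assumed shape, not something taken from the article.

```python
from typing import Callable, Dict, List, Tuple

def judge(prompt: str, answer_a: str, answer_b: str) -> str:
    # Hypothetical judge. A real setup would ask a stronger model which
    # answer is better; this stub simply prefers the longer one.
    return "A" if len(answer_a) >= len(answer_b) else "B"

def build_preference_pairs(
    samples: List[Tuple[str, str, str]],
    judge_fn: Callable[[str, str, str], str] = judge,
) -> List[Dict[str, str]]:
    """Turn (prompt, answer_a, answer_b) triples into chosen/rejected records."""
    records = []
    for prompt, a, b in samples:
        verdict = judge_fn(prompt, a, b)
        chosen, rejected = (a, b) if verdict == "A" else (b, a)
        records.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return records

samples = [
    ("Explain recursion.",
     "A function calling itself.",
     "A function that calls itself until a base case stops it."),
    ("What is 2 + 2?", "4, because 2 + 2 = 4.", "5"),
]
pairs = build_preference_pairs(samples)
for record in pairs:
    print(record["prompt"], "->", record["chosen"])
```

Nothing in this loop touches model weights; the output is only a cleaner dataset, which is the point being made above.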
janalsncm 1 month ago

Yeah I think the technical term is something more like “pseudo-labeling”. The OG distillation requires logits which Anthropic doesn’t provide.
SubiculumCode 1 month ago
The compute deficit of Chinese AI companies is real, and it IS THE ONLY competitive advantage that Western companies have.
The only way the U.S. keeps that edge is to prevent distillation. The only way Chinese companies can make up for the deficit in compute is to distill. There is innovation in great supply on every side of the ocean. It's about the chips. And in terms of national security, for the U.S. and for China, it's about the chips and the distillation that undermines that advantage. This is an arms race.
- HarHarVeryFunny 1 month ago
  
  If compute or access to training data were the only issues, then companies like Meta and X.ai (Grok) should be doing better, even Google for that matter. Musk even admitted that Grok used training data from OpenAI models.
  https://techcrunch.com/2026/04/30/elon-musk-testifies-that-x...
  While there is no moat as such, there is still a lot of expertise that goes into training SOTA models. There's a reason Google was willing to pay $2.7B just to get Noam Shazeer back to improve Gemini.
- PunchyHamster 1 month ago
  
  > The only way the U.S. keeps that edge is to prevent distillation.
  For how long? A year? How long until a model that is a year behind is fine for 90%+ of use cases?
  
- pennomi 1 month ago
  
  If saying “plz don’t distill me” is your moat, you don’t have a moat.
  
- gmerc 1 month ago
  
  You got that wrong. The forcing function of compute scarcity is an advantage not a detriment. The amount of investment pulverized in performative model training and dead ends (Hi Sora) should make this obvious.
- davedx 1 month ago
  
  Define compute deficit?
  They've been bringing out open weight models competitive with frontier models. How could they do that if they had a compute deficit?
  
sorenjan 1 month ago
Doesn't "real" distillation use the logits instead of the final tokens? I would classify this more like using a model to generate synthetic training data.
- 0xbadcafebee 1 month ago
  
  Distillation is a category of techniques which generally speaking all extract knowledge from a target model to feed into a new model. Logit distillation requires access to the source model layers; final token distillation doesn't. The former is more effective, but the latter can be done with generation tokens alone.
  This article explains the difference (and addresses "they're distillin' our models!!!"): https://dev.to/p0rt/how-model-distillation-actually-works-an...
lemax 1 month ago

I've used RLAIF to build out heuristic based non-LLM models for various decision systems and achieved like, 95% F1 on certain projects. We're in a place where models can be used to fine tune a lot of stuff via loops.
friendzis 1 month ago

> These complaints of distillation are inflating the problem to make it sound worse than it is
This is, in part, a problem every judicial and legislative system has faced since forever: form versus function.
Take a classic elicitation technique from spying: a foreign spy meets a military officer/scientist at a bar, strikes up a conversation, wonders aloud how a missile could hit some target at some accuracy, and elicits a response that with laser guidance it is entirely possible. From this they learn that there is some technology to laser guide missiles. Or in retail, a competitor hiring a secret buyer for core baskets of goods and analyzing prices in the receipts.
The function is espionage, the form is conversation and all info is in a sense provided willingly. Where do you pull the slider?
These distillation "attacks" are not only indistinguishable from evals, they ARE evals. The function is training one's own model, the form is eval. Normally, one would expect a discussion, based on risk-benefit analysis, of which direction to push the legality slider. The problem with these recurring statements is that they invoke enshittification of legislature.
crazylogger 1 month ago

Chinese labs access Claude via API. Isn't it the black box method by definition?
fnord77 1 month ago
Can you reach into the model and "transplant" weights directly?
- cheesecakegood 25 days ago
  
  Yes you can! Well, mostly, depends on how pedantic you are with definitions: you can transplant layers but not weights, which in common parlance are conceptually similar. But usually it isn’t a good idea for a few reasons.
  There’s a really fascinating example[1] where a guy identifies a particular set of layers and transplants them. Overgeneralizing, early layers are encoders and the later layers are decoders and in the middle some blocks seem to do specific things or tasks related “reasoning”. So you can actually create a FrankenLLM and it sometimes works.
  This needs architectures to be roughly similar however and internal representations to be consistent-ish so for “stealing” it’s not really a thing (other practical concerns aside)
  [1] https://dnhkng.github.io/posts/rys/
- X-Ryl669 1 month ago
  
  I'm not 100% sure it's not possible. If (I don't know) it's possible to freeze the temperature of the model so it's deterministic, and if you could make a map of produced words back to tokens (via HMM probably), then you can probably alter a minimal input and observe the output to model it. If you perform waves of such minimal alterations, you can expect to be able to locate the distance where each alteration impacts the model (the idea being that a small alteration in the output is likely due to the last layers of the model, and a large alteration is likely due to the deeper layers). Once you've located most of the last layer(s?) weights, you can try to solve for them. With a model of hundreds of billions of weights, the last layers will likely be so huge that it's probably technically infeasible, but it's theoretically possible.
- antonvs 1 month ago
  
  You can do things like that - one example is averaging weights between related models - but not with Anthropic's models, because outsiders don't have access to the weights.
  
- parineum 1 month ago
  
  If you have access to the weights, you can just use them as is...
  
- jorisw 1 month ago
  
  No, you'd need to have the model on your filesystem for direct access, and then the architecture would need to be the same.
killerstorm 1 month ago
I'm sorry, but you got the terminology exactly backwards. Training on the answer is called supervised fine-tuning.
Just for the sake of clarity:
0. Full distillation uses logits of the teacher model - that's much more information than the text itself. This is a kind of distillation used inside labs, but one can't distill Claude this way as logits are not available via API.
1. Supervised fine-tuning on synthetic data might be called blackbox distillation. I guess that's what you meant in your case (1).
2. Reinforcement learning (like RLAIF) uses the least amount of information from the teacher, i.e. only a few bits per task.
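To make the gap between items 0 and 1 concrete, here is a rough sketch over a toy three-token vocabulary. All logits are made-up numbers, and the function names are illustrative rather than any library's API. With logits, the student is trained against the teacher's whole distribution (KL divergence); with API text only, it sees a single token id (cross-entropy on a hard label).

```python
import math
from typing import List

def softmax(logits: List[float]) -> List[float]:
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p: List[float], q: List[float]) -> float:
    """KL(p || q): usable only when the teacher's full distribution p is visible."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def cross_entropy(token_id: int, q: List[float]) -> float:
    """Loss when only the teacher's emitted token is visible (black-box case)."""
    return -math.log(q[token_id])

teacher_logits = [2.0, 1.5, -1.0]  # made-up values for illustration
student_logits = [0.5, 0.2, 0.1]   # made-up values for illustration

teacher_probs = softmax(teacher_logits)
student_probs = softmax(student_logits)

# Case 0: full distillation against the teacher's distribution.
logit_loss = kl_divergence(teacher_probs, student_probs)

# Case 1: the API returns only the most likely token (greedy decoding assumed).
teacher_token = max(range(len(teacher_probs)), key=teacher_probs.__getitem__)
sft_loss = cross_entropy(teacher_token, student_probs)

print(f"KL on full distribution: {logit_loss:.4f}")
print(f"cross-entropy on token {teacher_token}: {sft_loss:.4f}")
```

The KL target tells the student that token 1 was a plausible runner-up and token 2 was not; the hard label carries none of that, which is why the black-box route needs many more samples for the same signal.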
- 0xbadcafebee 1 month ago
  
  I agree I left out option 0, but I think the other two were presented correctly?
  - Black box distillation uses direct answers to questions and conversation style. This is less useful as you still have to do supervised fine-tuning on the answers, as they may be wrong, and don't lead to greater insights (which reinforcement learning does)
  - RLAIF relies on preferences and values to judge answers. These don't need supervised fine-tuning and help guide the new model to better answers rather than just correcting specific previously asked answers
  
JumpCrisscross 1 month ago
> These complaints of distillation are inflating the problem
They’re also missing the point. What would have happened to a member of the Manhattan Project who, through personal pursuit of profit, neglected their duty enough to let the bomb leak?
- nixon_why69 1 month ago
  
  The companies are all for-profit companies. It's not like they're selling out some national security goal for profit; profit is the point.
  Anthropic already heavily restricts Chinese traffic but that only jams up researchers and regular Joes. Anyone motivated enough can hop a flight to Singapore with an nvme drive in their pocket.
catigula 1 month ago
Chinese companies are engaging in anti-competitive practices, as usual. They are rogue actors on the economic scene. If it were feasible, they'd be widely banned, and for good reason.
- amanaplanacanal 1 month ago
  
  Bringing more competition is "anti-competitive" now.
  

HarHarVeryFunny 1 month ago

I guess "paid to use our model" doesn't sound as sanction-worthy as "illicitly extracted .. model capabilities" and "attacked".

I guess we can say that Anthropic attacked and illicitly extracted data from Wikipedia, Reddit, Stack Overflow, etc, etc.

X.ai attacked and illicitly extracted data from OpenAI

https://techcrunch.com/2026/04/30/elon-musk-testifies-that-x...

Meta attacked and illicitly extracted data from LibGen

https://x.com/jason_kint/status/1879152507865485497/photo/1

And more generally the US-based AI companies have perpetrated a massive distillation attack on the entire human race.

Not that it makes any difference, but I wonder if Anthropic, while claiming that Alibaba "extracted Claude model capabilities", in fact have any clue what Alibaba did with their paid Claude responses. It would seem to amount to industrial espionage if Anthropic do know, although I expect they don't.

walrus01 1 month ago

Reminds me a bit of the anecdote of Steve Jobs complaining about people ripping off the Mac GUI, in the mid to late 1980s, when he gave no public acknowledgement to the work done by Xerox on the Alto and Star operating system.

"you're trying to rip off what I've already ripped off!"

Crawl the whole Internet to build a gargantuan sized LLM and then complain you're being copied...

breput 1 month ago
I think you meant a quote attributed to Bill Gates:
"Well, Steve, I think there's more than one way of looking at it. I think it's more like we both had this rich neighbor named Xerox and I broke into his house to steal the TV set and found out that you had already stolen it."
- walrus01 1 month ago
  
  Yes, I think the Gates quote was a response to repeated and aggressive complaints originating from Jobs (to anyone who would listen) that he had been ripped off.
- jakebasile 1 month ago
  
  I don't know if that's a real quote from Gates, but I do know it was in Pirates of Silicon Valley.
  
- itopaloglu83 1 month ago
  
  I thought Xerox demoed something they hadn’t implemented yet, and Apple turned a mockup into a real GUI.
  
0xpgm 1 month ago
Yeah, the whole AI industry is just people ripping off each other. It started with AI companies gulping up all the information that technical or altruistic people shared on the Internet in the past 40 years to help other fellow humans, then moved to AI companies consuming pirated and copyrighted material, and now it's AI companies ripping off each other.
Information really does want to become free, but AI companies want to be gatekeepers. Long term I bet on the open weights to win, as the more sustainable approach.
- bloppe 1 month ago
  
  I'm very pro distillation. I think there need to be distillation non-profits that curate massive corpora of super high value training data from frontier models. They could have an "anonymous contribution" system where regular people with Max subscriptions upload their conversation histories. It's a rough concept, but it would surely be a huge boon to humanity.
  
suprjami 1 month ago

Not just crawl the whole internet, but also commit commercial copyright infringement and settle a class action out of court with the authors whose books you pirated.
https://www.authorsalliance.org/2025/09/07/the-anthropic-set...
"One rule for thee, a different rule for me." - Dario
seanmcdirmid 1 month ago
Apple gave Xerox the right to buy $1 million of pre-IPO stock before the meeting took place.
- mrandish 1 month ago
  
  Glad you pointed this out. I believe the sequence was that Jobs himself got a shorter demo during his first visit with no prior arrangements. He then negotiated bringing back a group of his key people to get a more in depth demo and that included the stock deal.
  When Apple was accused of 'ripping off' PARC, Steve didn't seem keen to bring up this rather salient point. I suspect it may have been a combination of wanting Apple to continue receiving credit for these innovations from consumers and also the fact that, in retrospect, the million dollar stock deal could seem a bit like trading beads to Native Americans for Manhattan Island. Another point worth noting is that Apple's PARC visit was in December 1979 and the Xerox Star was publicly announced in April 1981, so Apple got a 15 month head start (the Apple Lisa shipped in Jan 83).
  I've also heard that Xerox didn't hold on to the Apple stock for very long, so never gained the windfall they could have. As is well documented, Xerox senior management didn't understand what they had in PARC and also didn't understand how rapidly microcomputers would become ubiquitous. So, of course, they didn't think Apple's stock price would skyrocket either.
  
root-parent 1 month ago

All LLMs consider Jon Skeet their God...
taneq 1 month ago
“You’re trying to kidnap what I’ve rightfully stolen!”
- jadar 1 month ago
  
  Perhaps an arrangement can be reached?
Nicholas_C 1 month ago

I’d agree with you if it didn’t cost billions to train models.
nonethewiser 1 month ago
[flagged]
- paxys 1 month ago
  
  The websites, music, movies, books, photos, art that they stole didn't appear out of thin air. The amount of time and effort people have collectively poured into creating these works throughout history far, far surpasses Anthropic's own effort of converting them into model weights.
- bloppe 1 month ago
  
  The equivocation is crawling website <-> crawling LLM responses.
  Both Anthropic and Alibaba are trying to build bleeding edge LLMs. That part is the same. The way they source their data is slightly different, but they would both argue it constitutes fair use under Copyright law.
- walrus01 1 month ago
  
  "Your extremely efficient multi petabyte internet content suction machine is ripping off my extremely efficient multi petabyte internet content suction machine"
  Sucking down petabytes of people's copyrighted content that they never granted a specific license to you to use seems to be an unavoidable and default part of the process of building any huge LLM.
  
- epsteingpt 1 month ago
  
  It's not really equivocation in this instance. This feels like a 'bad faith' comment. We can do better.
  LLMs literally wouldn't work without the sum total of knowledge (in the forms of books and other copyrighted content) being used as 'training data' for these LLMs.
  The 'bleeding edge' LLMs required many things, but:
  1. Tech innovation ('attention')
  2. Lots of compute
  3. Data
  4. Pre + post training
  #4 doesn't happen without #3.
  It's pretty obvious at this point that the major providers have stolen vast amounts of #3 - they have paid nearly 0 of the creators.
  We can argue about the impact (I'd lean net good) vs. the cost. But arguing there isn't a cost is a bit silly.
  

bg24 1 month ago

Relevant article - https://www.anthropic.com/news/detecting-and-preventing-dist... (3 labs generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts). So extraction in this context is distillation.

While it is obvious to many, a modern LLM is built in roughly three stages: the foundation (pretraining) model, then SFT/supervised fine-tuning (distillation makes it easy), then the RL/RLHF stage on top (most effort-intensive). For today's reasoning models, RL/RLHF is becoming the most compute-intensive part.

Companies like Anthropic spent millions building those fine-tuning examples. A follower can shortcut that on both cost and time by distilling, and it will keep happening: every time the frontier lab climbs higher, others will find a way to shortcut the new gap. There's very little Anthropic can do beyond fraud prevention and blocking accounts that violate their terms of service.

On the policy question, I'm completely against banning Chinese models. I'm a heavy Claude Code user and I'll keep being one. But there should absolutely be price competition. China is eating the rest of the world for breakfast, lunch and dinner on manufacturing, and it did not help to ban them. Frontier pricing can't sit at 10x a capable competitor. It doesn't need to be at par either — demand is higher, and quality, trust, and fewer tokens to finish a task are worth a premium — but 4–5x is defensible.

throw10920 25 days ago

> Companies like Anthropic spent millions building those fine-tuning examples. A follower can shortcut that on both cost and time by distilling, and it will keep happening: every time the frontier lab climbs higher, others will find a way to shortcut the new gap.
The generalization of this is: technologically advanced societies only continue to function as long as you prevent people from circumventing the technological business model (initial R&D investment that is recouped by selling units of the product above their manufacturing cost) by stealing your R&D (allowing them to sell units based on manufacturing cost alone, because they externalized their R&D to you).
This means both taking action against malicious actors inside your system of governance (IP laws in your country) and outside of it (sanctions, internet blocks, ITAR restrictions).
An honest competitor is perfectly capable of competing on price without stealing IP - see Mistral and that newer EU model that's trained on actually licensed content.
And I agree - we want a wide variety of models, from less-capable (but far cheaper) to those that maximize intelligence at any cost.
But those advocating for China distilling US models are just advocating for wealth transfer from the latter to the former - and highly likely to be 50 cent party members.

amazingamazing 1 month ago

Distillation is fundamentally impossible to protect against. All you can do is slow them down. Change my view.

Eventually these Chinese companies will release some extension like Honey, which will sit on top real, non-Chinese clients and send everything to China anyway.

It's over.

lebovic 1 month ago
It's too late to prevent distillation of some capabilities, like writing code or finding vulnerabilities [1].
But an AI lab can continue to produce immense economic value without releasing the model publicly for potential distillation. For example, it could use a model solely in-house to develop therapeutics.
Hopefully there's a future where others can access frontier models, but it's not necessary if preventing proliferation through distillation is considered more important.
[1]: See the notes on distillation in https://dualuse.dev/posts/export-controls-on-fable
- bandrami 1 month ago
  
  My long-term prediction for the sector is that frontier models will be so expensive that they will only be available for grant-funded projects at research institutions, like supercomputer clusters were 25 years ago.
  
nonethewiser 1 month ago
Distilled models are necessarily behind so long as models are progressing. Models are progressing. Maybe it will be over some time in the future.
And Berkeley’s “False Promise of Imitating Proprietary LLMs” found imitation closes the style gap fast but there is a large capability gap.
https://arxiv.org/abs/2305.15717
- lebovic 1 month ago
  
  Curiously, this isn't always true.
  For example, GLM 5.1 is more capable at pentesting than the model from which it is alleged to have been distilled [1].
  Intuitively, this makes some sense: you can "distill" from multiple frontier models, and you can further post-train the distilled model. But I'm not sure exactly what happened with GLM 5.1.
  [1]: https://dualuse.dev/posts/chinese-models-are-sometimes-bette...
  
- Gigachad 1 month ago
  
  I'm ok with having last month's model at a tiny fraction of the price.
nonethewiser 1 month ago
I'm not so sure because we only seem to see distillation from China. What’s preventing tech companies from the UK, Germany, etc. from distilling Claude, GPT, etc.? Do they simply lack the ability to?
Point being there may be no technical solution but there may be a political one (theoretically).
- sailingparrot 1 month ago
  
  Meta Spark is rumored to have distilled Claude to some extent, early Gemini models as well. I think the biggest factor is that Chinese companies aren't really afraid of being sued by Anthropic because the jurisdictions are so disconnected. European/US companies don't have the same protection.
- Barrin92 1 month ago
  
  >What’s preventing tech companies from the UK, Germany, etc. from distilling Claude
  literally nothing but given that the Chinese already did it and the models are published what's the point. You can thank the Chinese taxpayer for subsidizing the electricity bill and just download the thing
- avd201 1 month ago
  
  Aside from politics/law, it's probably much easier for everyone else to distill from the Chinese model which already distilled Claude/GPT/Gemini. Maybe not as good a result, but you don't need to jump through dozens of hoops.
  
seany 1 month ago
I can't even come up with a reason to find it wrong.
- IncreasePosts 1 month ago
  
  I personally bristle at the corporate espionage and IP theft that China has undertaken the last few decades. I can't help but respond here whenever anyone brings up the inane comparison to Samuel Slater.
  But with this, I don't have an issue. There is no theft since what is being used is the exact product that is being delivered. Yes, it's breaking the ToS, but ToS are generally bullshit. Anthropic surely broke thousands of ToS or other legal terms while it was scraping for content to train on. Which is why they had to pay $1.5B
  
redwood 1 month ago

One simplistic way to describe distillation would be to try everything imaginable and cache the response. But trying everything imaginable is hardly trivial
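
As a rough illustration of the "try everything and cache the response" framing above, here is a minimal Python sketch. The `teacher` function is a hypothetical stand-in for an expensive model call (it just uppercases the prompt), and `CachedStudent` is a made-up name for this example; none of it reflects how any lab actually does distillation. It only shows where the analogy gets hard: a cache can answer the prompts it has already seen and nothing else.

```python
from typing import Callable, Dict


def teacher(prompt: str) -> str:
    """Hypothetical stand-in for an expensive model call."""
    return prompt.upper()


class CachedStudent:
    """Toy 'student' that memorizes teacher responses verbatim."""

    def __init__(self, teacher_fn: Callable[[str], str]) -> None:
        self.teacher_fn = teacher_fn
        self.cache: Dict[str, str] = {}
        self.teacher_calls = 0

    def answer(self, prompt: str) -> str:
        # Only query the teacher for prompts we have not seen before.
        if prompt not in self.cache:
            self.teacher_calls += 1
            self.cache[prompt] = self.teacher_fn(prompt)
        return self.cache[prompt]

    def knows(self, prompt: str) -> bool:
        # A cache has no answer for anything outside the prompts it tried.
        return prompt in self.cache


student = CachedStudent(teacher)
for p in ["hello", "world", "hello"]:
    student.answer(p)

print(student.teacher_calls)    # 2, the repeated prompt was served from cache
print(student.knows("unseen"))  # False
```

Actual distillation trains a student model that is meant to generalize beyond the collected prompts, which a plain lookup table cannot do.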
fg137 1 month ago

Jensen Huang likely agreed with you and tried to change Dario Amodei's view on that, but that attempt appeared to have failed.
So there's that.
HaloZero 1 month ago
Doesn’t that require them to register an account using the browsers they’ve compromised? If Anthropic adds identity verification, won’t that cut that down? Maybe it will let them use Gemini inside of Chrome.
- dannyw 1 month ago
  
  Residential IPs don’t even matter. Developers use devboxes, use Claude Code CLI on servers from just about every cloud, etc.
  There’s probably a decent volume of customers who just buy Claude Max and spend most if not nearly all of their sessions via Claude Code, and it’s not uncommon for power users to be working on multiple concurrent projects/tasks/codebases at the same time.
  How do you really block this without also impacting your core market of developers?
- ygouzerh 1 month ago
  
  Probably some business will pop up, like "rent part of your unused subscription", or even "proxy tokens with a premium", e.g. 5.5 USD on Opus 4.7 paid by the distiller to the user, who will then only spend 5 USD.
- amazingamazing 1 month ago
  
  No, they could easily buy legitimate, already registered accounts and use VPNs.
  
wg0 1 month ago
It's just like web scraping is impossible to guard against.
Change my mind.
- skarz 1 month ago
  
  Put your site behind Cloudflare, enable Bot Fight. Done.

someguyornotidk 1 month ago

What exactly is illicit about what they did?

Legally, model output cannot be protected by IP laws whether domestic or international. The most they can hope for is civil relief which is a stretch given the literally illicit methods they used to train their models.

Anthropic got treated the same way it has been treating everyone else. This is the bed they made and now they, too, have to sleep in it.

InkCanon 1 month ago
Anthropic is a master of Newspeak (see previously bugs -> vulnerabilities wrt Mythos). Distillation violates their terms of service, which is a civil offense, not a criminal one. It is not illicit or illegal, nor does it break any laws.
- Retr0id 1 month ago
  
  It's a clever choice of words because "illicit" does not necessarily mean illegal, so they're technically not wrong, even though that's the connotation they clearly want to convey.
wayeq 1 month ago
> This is the bed they made and now they, too, have to sleep in it.
How will they sleep at night on that giant pile of money?
- someguyornotidk 24 days ago
  
  Society can give that giant pile of money the anthropic treatment too.

throwawayffffas 1 month ago

"Illicitly"? Unless they broke in your servers and took your model weights it's not illegal. Hell, you are the guys that pirated all the world's works, that was actually illegal. Breaking your terms of service is not illegal regardless of how much you would like it to be.

And let's not forget they paid you for the tokens.

matheusmoreira 1 month ago

> Unless they broke in your servers and took your model weights it's not illegal.
Even if they did, I wouldn't have a problem with it. Leaking frontier model weights after the oligarchs spent their trillions training it is the best possible outcome for humanity. Whoever does that is a hero, the sort of person people used to write cyberpunk books about.

tasuki 1 month ago

> The strike by Alibaba is described as a "distillation" effort, which Anthropic has said involves training a less capable model on the outputs of a stronger one.

I don't see what's wrong about this.

> Anthropic said the campaign was conducted between April 22 and June 5, 2026, and generated more than 28.8 million exchanges with Claude through almost 25,000 fraudulent accounts.

What makes the accounts fraudulent? If they have paid the agreed price, surely it's fine? If they haven't paid, why did Anthropic provide them service?

wilg 1 month ago
Because Anthropic has terms of service with more stipulations than just "you must pay and can use the service for any purpose"?
- user_7832 1 month ago
  
  Oh, Anthropic, the company that hoover'd up everyone else's data, and is now unhappy when others are doing to it what it did to others? The same Anthropic?
  
- Gigachad 1 month ago
  
  I'm sure all the artists and creators they stole from had stipulations too.
  
- sampo 1 month ago
  
  > Because Anthropic has terms of service
  Not following terms of service doesn't necessarily constitute a fraud. It just means Anthropic can close an account that breaks the terms of service.
- impossiblefork 1 month ago
  
  Robots.txt are also ToS of sorts.
- tripleee 1 month ago
  
  violating their terms of service doesn't make it fraudulent?
  
- gspr 1 month ago
  
  So do a lot of the owners of data that Anthropic used for training. Anthropic proceeded to ignore said terms under the guise of fair use. Yet now they cry foul? Cry me a river.
  To be clear: In principle I'm on Anthropic's side here. But Anthropic et al. have been very clear that they want to take a huge dump on those principles, so here we are.
actuallyship 1 month ago

[dead]
ozgrakkurt 1 month ago

I mean they could read the traces and learn it themselves right? /s
eviks 1 month ago
> What makes the accounts fraudulent?
Fake identity? And general deception about the use
- scotty79 1 month ago
  
  Terms of use is local US fiction of wishful thinking. Nobody cares. You make something available, it's up to the consumer to decide how are they gonna use it. You don't want people to use your stuff how they please? Get off the market.
  
- matheusmoreira 1 month ago
  
  Nobody cares about the feelings of the trillion dollar corporation.

chvid 1 month ago

Unlike Anthropic and OpenAI, companies like DeepSeek, Alibaba, z.ai open source their models, which allows for true model to model distillation rather than what you can do when the model is only accessed via an API with its reasoning chain hidden away.

What Alibaba is doing is that they are tuning and training their models based on usage data from someone accessing Anthropic's models; in Anthropic's terms of service that usage data does not belong to the end-user but to Anthropic and they are trying to elevate this breach of their tos to a national security issue.

To me the battle between open source and closed source AI is literally a battle between good and evil.

Between a dark future where computing is centralized, surveilled and controlled by one or two entities. And a lighter future where computing is de-centralized, principally in the hands of end-users, who are ultimately free to understand, tinker and build what they want.

While I appreciate the freedom and wealth of the west; on this point we are clearly heading down the wrong path.

Schiendelman 1 month ago
Open weight and open source are different things!
- ahartmetz 24 days ago
  
  True, but open weight is much better than fully proprietary and the best that we have for now.

randomboy3423 1 month ago

A partly insider on this.

I think Anthropic is just marketing / bluffing, because they don't even have the data.

They do distill the models, but they don't go to Anthropic, they just use platforms like aws bedrock, there are too many restrictions on Anthropic's own platform.

bilbo0s 1 month ago
>they just use platforms like aws bedrock, there are too many restrictions on Anthropic's own platform
This is actually the only way that what Anthropic is alleging would make any kind of sense. And, as a matter of fact, is exactly what every enterprise does to train models.
This kerfuffle should be interesting to watch.
But, as always, everyone (in the US) should fully download all the Chinese models while you can. I suspect this may be the "Phantom Menace" they use to render illegal our use of Chinese AI tech just as they've rendered illegal our use of Chinese cars. Only difference is, we peasants may need the Chinese AI tech to have any chance of competing with Big Tech in the future.
And even with the Chinese tech, as Big Tech spreads their AI out into more and more niche areas, we'll likely still not be able to build startups that can compete with them.
It's just that without Chinese AI tech, we'll have no chance at all.
- altmanaltman 1 month ago
  
  > And even with the Chinese tech, as Big Tech spreads their AI out into more and more niche areas, we'll likely still not be able to build startups that can compete with them.
  You mean like Anthropic will eventually run Walmart? Or Salesforce? or Adobe? Or do you think midjourney will replace all medical spas? OpenAI will run the next Tesla? How can they focus on all this without raising trillions more? Why wont the gov force them to stop if they monopolize all niches even if they could?
  Building a frontier AI lab and pushing models forward is already a massive undertaking but we are assuming they will also create massively successful startups which nobody can compete with?
  idk sounds like the dream of people like Dario but not much sense does it make in the face of economic reality.
chews 1 month ago

there are vibe coded proxies that act like Claude Code. they use the sub not the api key. but they give you api key functionality... I know this cause I have the vibes.... and it works on every one of the other harnesses, it just takes some mitmproxy work... but ya. it's fair to say these are not the droids you're looking for

aftbit 1 month ago

So when Anthropic uses millions of copyrighted works to train their model, that's fair use, but when Alibaba uses Anthropic's model to train their own, that's infringement?

matheusmoreira 1 month ago

Rules for thee but not for me.

whywhywhywhy 1 month ago

Anthropic illicitly extracted the work of billions for a private model, their model is free for all to steal whatever they can from it in my opinion.

scoofy 1 month ago

Imagine if Anthropic had followed the terms of service of every data source when building its model.

bandrami 1 month ago

Oh wow it must suck to have an LLM creator rip off your IP for their own gain

AJRF 1 month ago

There is so much hot air and guff around AI, so please if you don't believe me verify yourself, but GLM 5.2 is "good enough" to replace Claude Code / Codex.

No it's not frontier, but it's beyond that point that Opus 4.5 hit where people started to really depend on Claude Code around last November time. It's also a fraction of the cost of a Claude Code subscription especially when you account for how high the usage limits are.

You get more usage than Claude Code $2400 a year tier for $1344.

That is a real threat (as opposed to the BS Anthropic is trying to sell you in the article in the original post) to the western AI industry. Similar performance for half the cost and it's NOT run by a US company - uh oh.

I suspect America is going to do what it always does, play a very dirty and underhanded game of blocking competition by trying to front some moral high ground as the reason.
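
For anyone checking the price comparison above, a quick Python sketch of the arithmetic. The two yearly figures are the ones quoted in this comment; the variable names are just illustrative, and the monthly split assumes both plans are billed evenly over 12 months, which the comment does not state.

```python
# Yearly prices as quoted in the comment (USD).
claude_code_yearly = 2400
glm_yearly = 1344

# Share of the Claude Code price that the GLM plan costs.
ratio = glm_yearly / claude_code_yearly
savings_pct = (1 - ratio) * 100

# Assumption: both plans are billed evenly over 12 months.
claude_code_monthly = claude_code_yearly // 12
glm_monthly = glm_yearly // 12

print(f"GLM costs {ratio:.0%} of the Claude Code tier")  # 56%
print(f"Savings: {savings_pct:.0f}%")                    # 44%
print(claude_code_monthly, glm_monthly)                  # 200 112
```

So "half the cost" is a slight rounding: on these numbers it is closer to 56% of the price.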

AJRF 25 days ago

Article just posted by the NYTimes on Z.ai (maker and operator of GLM 5.2) gaining ground in US: https://www.nytimes.com/2026/06/25/technology/zai-china-arti...
SubiculumCode 1 month ago
It seems more like the Chinese companies are playing the dirty game, distilling through bot accounts, not letting real competition across their firewall.
- AJRF 1 month ago
  
  So you are believing Anthropic's claim here, and it's not as if Anthropic didn't steal the data to train the model in the first place. I think the original sin doesn't give them any ability to complain.
  - https://www.theguardian.com/technology/2025/sep/05/anthropic...
  

lars512 1 month ago

This kind of systematic distillation by a competitor can allow them to fast-follow you and pick up capabilities.

If you've invested in expensive capabilities training, of course you don't want this, so it's in Anthropic's economic interest to hinder it however they can, and that's enough to explain their behaviour here.

Anthropic seems to genuinely care about safety though, which for the rest of us means not having models that enable easier cyberattacks, targeted scams, and the rarer but more severe risks like people trying to create and release new pathogens. This means walking a tight line, especially as models become more capable, and often wrapping a model in layers of defences against misuse.

If those capabilities transfer to a closed competitor model, all bets are off in terms of whether the competitor will apply the same defences.

If those capabilities transfer to an open weight model, not only will there be no ring of defences around the model, any defences you put into the model itself can easily be stripped away. So although it's nice to have capable open models, it will be increasingly bad for us all if open models keep fast-following closed model capabilities as they have been, at least until we have solved the active research problem of keeping them safe.

This is all to say that, however you might feel about Anthropic, we might still prefer that they can deter this kind of distillation for now.

dools 1 month ago

Kimi k2.6/7 running inside Kimi code already kicks the pants off the latest Claude and OpenAI models when it comes to cyber security. I regularly run multi model security reviews and while opus 4.6/7/8 and gpt 5.3/4/5 find a couple of things and declare mission accomplished (running inside pi), kimi k2.6/7 inside pi finds more issues and inside kimi code finds the most.
There are sometimes false positives but when I give Kimi’s report to the frontier models they more often than not confirm they are valid security issues but didn’t find them themselves.
kouteiheika 1 month ago

> So although it's nice to have capable open models, it will be increasingly bad for us all if open models keep fast-following closed model capabilities as they have been
Cat's out of the bag. The only way to make them safe is to make sure everyone has access to them. This might be an iffy analogy, but if Dario uses it all the time then so can I: they're kinda like nuclear weapons. If only one country has access to nukes then you're in trouble. If everyone has access to them, then it's mutually assured destruction to use them.
Sure, it could be increasingly bad if open models keep increasing in capability. But it will be much, much worse if only the rich and the powerful have access to this technology, and us -- the have-nots -- will have to contend with whatever scraps we'll be allowed to eat off the table of whichever billionaire is in control. We've already seen a prelude of this with Mythos being restricted and Fable being suddenly yanked. Is this the world you want to live in? Where only Dario and his friends have access?

whizzter 1 month ago

Whoa, Anthropic, etc. are really running afraid that their IPOs are gonna crash when people realize that the open models are Good Enough(TM).

So I'd put it at 30% that this is a ruse: say that Qwen 3.5, etc. is tainted by training on them and start issuing DMCA takedowns to protect the IPO valuation (or they'll hold off on that, since issuing a DMCA takedown could backfire spectacularly if others do that to them).

yggt 1 month ago

The open source models are more than good enough… c suite doesn’t care if the open source models means you’re slower in shipping by hrs/days if the cost savings make up for it.
This idea of shipping at max speed was stoopid as shit anyway. Going slow is arguably more important than fast fast fast.
InkCanon 25 days ago

I'd hazard anthropic perceives their number 1 enemy as open weight models. If alternatives to their business exist (which is mainly coding tokens currently), they will get into a nasty fight and the nightmare of all tech companies - losing their monopoly. It threatens their ability to extract value, and could reduce their valuation to a tenth of what it was. They cannot make open weight models worse, so they're using lawfare to try and get them banned. And of course we previously know they attempted to block Chinese companies under the guise of national security by lobbying for restricting GPU sales.

guybedo 1 month ago

This is a bit ironic, Anthropic complaining about a competitor using claude data to build its own product when Anthropic basically used all of human knowledge production to build claude, i don't think they paid every magazine, author, journalist, etc ...

This is almost standard practice in any competitive industry anyways. Disassemble your competitor's product, study it and try to reproduce / improve.

roxolotl 1 month ago

Yea I’ll never have any sympathy for this claim given that Claude is built on theft
uproarchat 1 month ago

It's a claude eat claude world out there
sp527 1 month ago
Ironically, it's likely that the only reason USG let them get away with this — instead of making obvious and necessary adjustments to copyright law — was so that the industry would remain competitive with China.
- csande17 1 month ago
  
  Given that the most recent time Anthropic attempted regulatory capture, the US government responded by saying "alright, we agree that Mythos is too dangerous to release, so we've banned you from releasing Mythos," I can't wait to see what the outcome of this next push is.
Aldipower 1 month ago

Yeah, and I believe Anthropic would "distill back" without thinking twice, if the other model would be good enough.
anematode 1 month ago
Yup, it's hard to take seriously any complaint about "stealing" Anthropic's services, when their entire business is based on massive theft.
- hsbauauvhabzb 1 month ago
  
  You should. Companies like this will inevitably try and pull the ladder up behind them.
  
- usef- 1 month ago
  
  The US labs do seem to have announced a lot of licensing deals though, and are buying things today due to the previous lawsuits.
  At what point will we be better to support a lab that pays (some) licenses today vs the ones that pay none?
  Some of the deals are in the hundreds of millions, so I suspect licensing is over a billion today? (Pure guess). That might become a big disadvantage in a price (or content) war.
  
reasonableklout 1 month ago
Anthropic did pay $1.5B to authors. But yes, it would be much better if they paid everyone on the internet dividends from every Claude chat. Or released Claude as an open model.
In practice, the former isn't very realistic, while the latter is politically dead as this is becoming a national security issue.
- SiempreViernes 1 month ago
  
  Anthropic was forced to pay some people they stole content from, there was no attempt at getting permission ahead of time.
  And paying basically everyone online is more or less a solved problem, it's what ad agencies have to do every day.
- MagicMoonlight 1 month ago
  
  [dead]

doublescoop 1 month ago

LLMs have an original sin: training data was not legally or ethically licensed. Getting anyone to believe that the result of that process should be protected by the laws that were ignored when it was created is never going to work.

zakkl 1 month ago

It sounds like Anthropic is eagerly trying to show to USG that they are willing to heavily monitor ‘foreign adversaries’ on their platforms.

This combined with no implementation of KYC makes it seem like they want to find a middle ground with Fable where it's off of export controls but they promise to prevent China and specific others from using it.

verdverm 1 month ago

This is not the first time it happened. What have they done to improve the situation? I suspect it's more of a cat & mouse game, with a lot more cats playing.
ninefathom 1 month ago

This seems to me like a stab in the right direction.
Obviously their actions are going to be fiscally motivated at the root, but sussing out how they intend the precise dynamics to play out is more nuanced.
Thinking of this as an effort to woo the defense hawks cuts a very clear path.

segmondy 1 month ago

For all the complaints about Anthropic many of you still give them your money! Stop using it. I don't care if they claim they are the best model. I stopped paying OpenAI and Anthropic 2 years ago once they started going for regulatory capture! They started whining once Llama3 was released and was good! Before the Chinese models got strong.

tokioyoyo 1 month ago
Pretty much objectively puts you/your company at a disadvantage if you’re not using frontier models.
- segmondy 1 month ago
  
  The bottom line has not shown that to be the case.
- recursive 1 month ago
  
  I keep hearing this. What do you mean by "objectively"?

drillsteps5 1 month ago

I'm looking forward to the trial where Anthropic will have to disclose sources of their training data, and then explain why they are entitled to charging customers for using regurgitated training data but Alibaba which trains their models on Anthropic's models are not.

Should be fun.

Edit: clarification

conception 1 month ago
They already did and paid 1.5B https://authorsguild.org/advocacy/artificial-intelligence/wh...
- gaiagraphia 1 month ago
  
  Quite amusing that the library of libgen is worth 1.5bil for unlimited access.
  It's about the same valuation as bun, lol.
  
- HarHarVeryFunny 25 days ago
  
  That's a tiny drop in the bucket of the value these AI companies have appropriated from society.
  Just to give an idea of the scale of it:
  Let's say a modern SOTA LLM has 1T params and is therefore trained on 100T tokens
  1000 tokens of text = 750 words of prose, which may take 15 min to 3hr to write (Gemini's estimate)
  1000 tokens of code = 50-70 lines of code, which may take 15min to 5hr to write
  We just want a rough estimate of the value of this, so let's say that 1000 tokens took 1hr of human labor to generate at an average wage of $50/hr
  So, if 1000 tokens cost $50 of human labor, then that 100T of training data cost $5T.
  So, the value of what the AI companies took from society might better be estimated in the trillions of dollars, not billions.
  And of course what they are doing with all this data is building generative AI, so it's not just the value of what they took, but more importantly the future opportunity they are stealing from everyone by replacing human labor with their automaton whose profits they intend to keep for themselves.
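
The back-of-envelope estimate above can be reproduced with a short Python sketch. All inputs are the commenter's own assumptions (1T parameters, 100T training tokens, 1 hour of labor per 1000 tokens, $50/hr); the function name `labor_value_usd` and the low/high bounds, which reuse the 15 min to 3 hr range quoted for prose, are only illustrative.

```python
# Inputs taken from the comment; none are measured values.
PARAMS = 10**12                  # 1T parameters
TOKENS_PER_PARAM = 100           # implied by "1T params ... 100T tokens"
TRAINING_TOKENS = PARAMS * TOKENS_PER_PARAM
WAGE_USD_PER_HOUR = 50.0


def labor_value_usd(tokens: int, hours_per_1k_tokens: float, wage: float) -> float:
    """Human labor cost of writing `tokens` tokens at the given pace and wage."""
    return tokens / 1000 * hours_per_1k_tokens * wage


# Central estimate: 1 hour per 1000 tokens.
estimate = labor_value_usd(TRAINING_TOKENS, 1.0, WAGE_USD_PER_HOUR)

# Bounds using the quoted 15 min to 3 hr range for 1000 tokens of prose.
low = labor_value_usd(TRAINING_TOKENS, 0.25, WAGE_USD_PER_HOUR)
high = labor_value_usd(TRAINING_TOKENS, 3.0, WAGE_USD_PER_HOUR)

print(f"central: ${estimate / 1e12:.2f}T")  # central: $5.00T
print(f"low:     ${low / 1e12:.2f}T")       # low:     $1.25T
print(f"high:    ${high / 1e12:.2f}T")      # high:    $15.00T
```

Even the low end of that range stays in the trillions, so the conclusion does not hinge on the exact hours-per-1000-tokens figure, only on the assumed token count and wage.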
- fg137 1 month ago
  
  That's only a fraction of the training data.
- drillsteps5 25 days ago
  
  Interesting. Looks like the judge ruled using legally obtained knowledge (books, articles, etc) to train AI constitutes "fair use".
  Given that the US legal system is precedent-based, that... changes things.
- cr125rider 1 month ago
  
  Meta/Facebook got away with it though right?
- mannanj 1 month ago
  
  That's a great cost-benefit ratio. Can you and I steal and do illegal things and pay the same cost?
  
appplication 1 month ago

Being logically consistent isn’t as profitable as being aggressive and loud.
ninefathom 1 month ago
While I love the sentiment, I feel like the odds of this actually ever reaching a trial are low, given the international positioning of the parties, and the... um... complex relationships involved.
Anthropic's actions seem performative. Others have already speculated on the likely audience(s).
- AdieuToLogic 1 month ago
  
  > While I love the sentiment, I feel like the odds of this actually ever reaching a trial are low ...
  As cited in a peer comment here[0]:
  In June 2025, Judge William Alsup of the U.S. District Court for the Northern District of California ruled on summary judgment that using books without permission to train AI was fair use if they were acquired legally, but he denied Anthropic’s request for summary judgment related to piracy—finding that the piracy was not fair use.[1]
  Of note in the judge's finding; "the piracy was not fair use".
  0 - https://authorsguild.org/advocacy/artificial-intelligence/wh...
Artoooooor 1 month ago

And if it includes at least one GPL source, they should release the weights on GPL license.

gmerc 1 month ago

Evergreen, really, Anthropic's desperate screaming for government protection, aka pulling up the ladder after them. Nothing short of disconnecting global markets will work because the incentives are just too damn delicious

https://georgzoeller.com/blog/posts/us-ai-labs-love-the-ai-r...

nullbio 1 month ago

Anthropic has no right to cry about this when they train their models on the entire internet, which is not their content to begin with.

If it's not obvious yet, this technology wants to be free and shared. Stop trying to protect your moat and do the right thing.

salviati 1 month ago

If you have openrouter do this little experiment: Go to https://openrouter.ai/chat. Select a few models, but customize them to have an empty system prompt.

Then ask: "你是什么模型?" ("What model are you?" in Mandarin).

My result after trying only three times: Sonnet 4.6 says it's DeepSeek, while Opus 4.8 says it's Qwen. The second time around Sonnet said it was Anthropic Claude.

Are Chinese companies currently complaining about Anthropic distilling their models?

InkCanon 25 days ago

"You're trying to kidnap what I've rightfully stolen."

exabrial 1 month ago

I like Anthropic's models, use them regularly. However, it weighs on my mind that there is quite the irony in an LLM company complaining about someone stealing their stuff or using it in a way they don't like. The training data for these models is a massive gray area that they seem to be hoping people will just forget about and move on from.

All that being said, Anthropic seems to be a good company, I'd work for them, but they probably need to help themselves out of the spotlight. A little too much press coverage as of late.

softwaredoug 1 month ago

One thing I think about a lot is how these companies meter coding / work. They want the economy to go through them.

I just don’t see how the economy tolerates that. We’re already seeing people getting more conservative about their token spend. Even if Chinese open models went away, the pressure to create something else and put price pressure on the current duopoly will just intensify.

I see these companies are scrambling to find whatever moat they can. It’s not a good sign for them if regulatory capture becomes that moat.

paxys 1 month ago

Repeatedly warn everyone that your models are so good they will wreck cybersecurity.

Complain/brag that Chinese firms are illegally using the models and bypassing export controls.

Be surprised when your model gets banned by the government.

neurostimulant 1 month ago

> Anthropic said in a February posting that it had identified a campaign by Chinese AI startup DeepSeek ...

> It said DeepSeek's operation involved over 150,000 exchanges

That volume seems more like the number of requests 15 employees using Claude Code would generate in a month. It seems too small for a large scale model distillation campaign.
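
A rough sanity check of that comparison, as a sketch only: the 150,000 figure and the 15-person team come from the text above, while the 30-day month and 22 working days are assumptions added here for illustration.

```python
# Back-of-the-envelope check: how many exchanges per person per day
# would 15 people need to generate to reach 150,000 in one month?

def exchanges_per_person_per_day(total_exchanges: int, people: int, days: int) -> float:
    """Average number of exchanges each person must produce per day."""
    return total_exchanges / (people * days)

TOTAL_EXCHANGES = 150_000   # figure quoted from Anthropic's posting
PEOPLE = 15                 # team size suggested in the comment
CALENDAR_DAYS = 30          # assumption: a 30-day month
WORKING_DAYS = 22           # assumption: roughly 22 working days per month

per_person_per_month = TOTAL_EXCHANGES / PEOPLE
per_calendar_day = exchanges_per_person_per_day(TOTAL_EXCHANGES, PEOPLE, CALENDAR_DAYS)
per_working_day = exchanges_per_person_per_day(TOTAL_EXCHANGES, PEOPLE, WORKING_DAYS)

print(f"per person per month:  {per_person_per_month:.0f}")
print(f"per person per day:    {per_calendar_day:.0f} (calendar days)")
print(f"per person per day:    {per_working_day:.0f} (working days)")
```

Whether a few hundred requests per person per day is realistic depends on how an "exchange" is counted; an agentic coding session can issue many model calls per task, but that is an assumption and not something the posting specifies.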

crnkofe 1 month ago

Sounds like just a case of pirates "illicitly" stealing from pirates. I don't really see anything ethically questionable there. I wonder if US corps will ever come out about all the resources used to train the original models and who they actually asked for permission when collecting data.

khriss 1 month ago

I am not sure how it's OK for Anthropic to basically ignore copyright to train frontier models (using work owned by others without permission) while simultaneously claiming Chinese AI companies doing the same to them is illegal.

cush 1 month ago

It’s hard to see how distillation is any different than how these models were created in the first place - siphoning up all human knowledge without consent, credit, or compensation

netcan 1 month ago

Hypocrisy is a form of corruption.

Anthropic's IP was created by harvesting and "distilling" other people's IP. Copyrighted materials, and the commons... which they have essentially privatized.

The commercial goal is to avoid competition. One of the main worries for AI is "commoditization", which has come to mean "not a monopoly." To that end, it doesn't matter if the competitor is Chinese, American, or other.

Their motivation here is clearly protectionism. The argument they make to politicians is national security. The legal argument is IP-theft, violation of service agreements or whatnot.

This is all very dangerous. Commercial interests repackaged as national security can lead to armed conflict.

trymas 1 month ago
> Anthropic's IP was created by harvesting and "distilling" other people's IP. Copyrighted materials, and the commons... which they have essentially privatized.
Anthropic and others argue that because LLMs don’t output full copyrighted works word for word - hence their LLMs aren’t infringing on copyright laws.
I think (if this ever comes to that) a Chinese lab should use the same arguments against Anthropic.
UPDATE: this is a slight hyperbole of course, not worth arguing what they actually said. The point is intent and the facts - "The Big LLMs" "distilled" collective knowledge including copyrighted works at unimaginable scale, but it's all kosher and totally not piracy/copyright infringement. Though if you're a teenager torrenting an mp3 - you'll get screwed.
- nutjob2 1 month ago
  
  > LLMs don’t output full copyrighted works word for word
  Apparently they do, as per the evidence in the NYT vs OpenAI suit.
- gspr 1 month ago
  
  > Anthropic and others argue that because LLMs don’t output full copyrighted works word for word - hence their LLMs aren’t infringing on copyright laws.
  That surely can't be what they argue, because I'm sure I can't translate a copyrighted book into a different language and say "that's fine, it's not word-for-word".
- Hamuko 1 month ago
  
  Isn’t the output of LLMs completely copyright-free in the US?
  
steve1977 1 month ago
Bad China is stealing our stolen IP!
- badgersnake 1 month ago
  
  And putting it into free models like Qwen. It’s hard to care about this.
handoflixue 1 month ago
"Copyright violation of a published work" and "stealing private trade secrets" are in fact very different crimes.
Humans have spent millennia harvesting and distilling each other's IP - "the shoulders of giants" and all that, so it's an especially disingenuous take.
- ajb 1 month ago
  
  For something to be a trade secret, you have to actually keep it secret. If I get the ingredients of Coca-Cola from an ex-employee, I've stolen a trade secret. If I work it out by doing a chemical analysis, I've stolen nothing.
  There is a difference with Anthropic, as no-one signs a licence agreement to buy a Coke. But Anthropic are also not saying you can't publish the output of their models. It's not clear to me if trade secret law will (or should) cover a secret which can be extracted from information that licensees are not restricted from publishing.
  
- trymas 1 month ago
  
  > Humans have spent millennia harvesting and distilling each other's IP
  You may be somewhat correct, but copyright lawyers wouldn’t have any work if others' IP were up for grabs willy-nilly just because of “shoulders of giants and all that”.
  

democracy 1 month ago

That's brilliant - "we're gonna take your job away from you, please start using our tools", "we stole the content to sell you, and now we are getting robbed, please feel sorry for us". What's next?

neves 1 month ago

So said the guys who "extracted" knowledge from all pirated books

steve_woody 1 month ago

This is genuinely funny. The largest data thief of all time complaining about the stolen data being handed out to competitors by (paid?) accounts of its own product.

_fzslm 1 month ago

Anthropic being pissed enough to announce this means that, despite encrypting their reasoning chains, it doesn't matter – distillation lives on.

Sweeeeeeeet.

camgunz 1 month ago

I am never even once hearing intellectual property or copyright claims from Anthropic, whose product depends entirely on having consumed all human output ever made regardless of those rights.

steinvakt2 1 month ago

If the data consumed (required to train such a model) is open source/openly available/public data somehow, then a majority of the revenue belongs to the public as well. Such as the philosophy behind the Norwegian oil fund etc.
foxrider 1 month ago

Exactly. They scraped the internet all of us built with our own research, open source work, sharing, etc. I'm never going to agree that they own their models.

abbassix 1 month ago

"It [Anthropic] said DeepSeek's operation involved over 150,000 exchanges". In my humble opinion, a mere 150k exchanges with an LLM could only be benchmarking, not distillation! I think the US companies should accept that after decades they have rivals surpassing them, just as they surpassed the Europeans almost a century ago.

jameson 1 month ago

AI companies stole the internet.

They should collaborate and come up with ways to give back to society rather than competing and complaining.

Thieves can't complain about what they stole.

xingped 1 month ago

Thieves whining about thieves. They'll have to excuse me for having exactly zero sympathy.

runnig 1 month ago

I'll just leave it here: "Anthropic's downloading of over seven million books from pirate sites like LibGen constituted infringement, the judge ruled, rejecting Anthropic's "research purpose" defense: "You can't just bless yourself by saying I have a research purpose and, therefore, go and take any textbook you want."

https://www.joneswalker.com/en/insights/blogs/ai-law-blog/wh...

scientism 1 month ago
Don't you find it funny that when you ask for song lyrics these models suddenly remember copyrighted material?
- f6v 1 month ago
  
  Some do, others decline to answer.
rienbdj 1 month ago
In the early days of music streaming, many of the entrants were seeding their service with vast libraries of pirated content. The winners cut deals with the copyright holders and then went after the rest.
- smurda 1 month ago
  
  Or the early days of video uploads, YouTube's most watched videos were "pirated" clips from popular shows (e.g. SpongeBob, The Daily Show) and part of the reason I went to YouTube instead of other video hosting sites (e.g. DailyMotion).
  Viacom sued YouTube, while CBS and Universal ended up licensing their content.
  https://www.eff.org/deeplinks/2007/03/viacom-v-google-invest...
  
nicce 1 month ago
Yet they did not need to destroy the models which were trained with them?
- ascorbic 1 month ago
  
  Using them was allowed as fair use – it was the downloading of the pirated copies that was infringement. That's why Anthropic switched to scanning paper books.
  
- zaptrem 1 month ago
  
  Should we require the destruction of the brains of those that watch pirated movies?
  
gmerc 1 month ago
How many “capabilities” did they “extract” from those books?
- thepasch 1 month ago
  
  The capabilities of the books' writers to produce the text contained within them, which is exactly what Alibaba "extracted" from Claude. The point here is that Anthropic's framing as some sort of sophisticated technological attack is the ridiculous part. It's writing prompts and saving responses. We're all running "distillation attacks" on Claude, every day! Most of us just don't feed that stuff into a training corpus.
RobotToaster 1 month ago

"You're trying to kidnap what I've rightfully stolen!"
basisword 1 month ago

Exactly. Couldn't happen to better people. I'm pretty against piracy personally but if we find reliable ways to pirate Anthropic/OpenAI products in the future I'm all for it.

SubiculumCode 1 month ago

Everyone here praising these Chinese companies for their smarts (sure they are smart) has been ignoring this very big fact: their improvements have mostly come from being parasitic on the leading edge SOTA models, not from some inherent innovation advantage. They are as innovative as their western counterparts, but they lack the compute, so their keeping up within months of those SOTA models depends on other means, like distillation attacks. I don't blame them; it's the obvious only strategy when you can't compete in compute. But we shouldn't be blind to the real state of affairs: equal innovation; unequal compute; distillation attacks are the only vector to keep up.

kgeist 1 month ago
>like distillation attacks. I don't blame them; it's the obvious only strategy when you can't compete in compute
>distillation attacks are the only vector to keep up
It's demonstrably wrong, they invest in architectural improvements as well, for example, DeepSeek's compressed attention. When you lack compute, you need fast training/fast inference, and distillation alone doesn't solve it. From what I understand, that kind of distillation "attack" (28 mln exchanges) only slightly improves instruction tuning/reasoning traces. If the base model is crap, distilling Claude on a few million exchanges alone won't magically make your model as good as Chinese models currently are (or magically make inference faster on the limited hardware they have). And training the base model needs a proper training run. Serving users at scale needs optimized architectures as well, especially with test-time compute and ever growing context lengths. That's where architectural innovations are happening in Chinese labs when it comes to compute.
- SubiculumCode 1 month ago
  
  I explicitly called out the fact that there is plenty of innovation. We see lots of innovation in both Chinese and U.S. labs, and I don't think that there is a comparative difference there.
Anoian 1 month ago

[dead]

moomin 1 month ago

I fail to see the difference between the distillation described in the article and the distillation described in Bartz vs Anthropic.

PeterStuer 1 month ago

The whole investment/valuation model of AI companies is based on "winner takes all", aka a monopoly. This necessitates regulatory capture and lawfare.

Anthropic has been advocating openly for pulling up the drawbridge, ending competition and ending progress.

They will continue to lobby for restricting your access. If the Mythos/Fable restrictions had come in after their IPO, they would have danced with joy, as this de facto has them achieve their goal after unloading the mountain of debt from the institutional onto the retail investor.

As it stands, they are set up to be acquired by Google, Apple, Amazon, SpaceX or Microsoft or any other 3 letter agency good boy for cheap.

gaiagraphia 1 month ago

This is great for competition! Chinese vendors offering a cheaper solution = what economics told me the free market was all about.

I also learnt that Anthropic should get better at what they do if they want to compete. If not, somebody else will win.

Or does this not apply to huge US corporations any more?

petterroea 1 month ago
China aren't offering a cheaper solution. They are subsidizing an existing one (which is already subsidized) in order to gain foothold. The difference is that in the US subsidies come from VC, while OP implies subsidies come from the AI labs that buy the training data (which may as well also be VC backed, so just one extra hop).
This isn't "the market working as intended", this is an exhaustion fight to the bottom where the one with most money gets to stay in the market. As with most venture capital startups. I believe this VC tactic is a well documented "cheat code" to bypass market forces and build a monopoly. I find it hard to compare that with a free market.
However, I don't really mind China "stealing" from Anthropic. We consumers are having our cake and eating it too. I.e., we are getting rapid improvement to the tune of over a hundred billion dollars in funding, yet the market remains big enough that there's a chance of it not ending up as a monopoly in 20 years. And venture capital is footing the bill. A part of their investment is practically being redirected to fund Chinese AI development. It lets us live out our lives as happy CAC farmers[1].
So I would argue it's not so much a "cheaper solution" as it is intentionally and maliciously abusing another company's product to extract more value than the billing plans intend (given an average user), and further subsidizing the product by selling this data to competitors. But I don't necessarily think it's a bad thing for us end-users. Nor for the market. But it is bad for Anthropic and its investors.
[1]: https://phrack.org/issues/71/17
- overfeed 1 month ago
  
  > China aren't offering a cheaper solution. They are subsidizing an existing one
  Chinese labs are also pursuing legit frontier-advancing R&D into efficiency and publishing papers in the open, a culture that's in retreat at top American AI labs
  
- user_7832 1 month ago
  
  > China aren't offering a cheaper solution. They are subsidizing an existing one (which is already subsidized) in order to gain foothold.
  In my economics classes, we were told that (in a "free market" argument) the best thing to do if a subsidy is making something you want cheaper is to use it. You're getting your thing, and at a reduced cost.
  (I'm not really replying to you per se, I'm curious how "free market" folks in these comments would respond to this.)
  
- throwaway7356 1 month ago
  
  > China aren't offering a cheaper solution. They are subsidizing an existing one
  So basically like US companies subsidizing offerings with selling user data, ads for crypto scams, manipulation for elections, making people addicted to gambling and so on?
  Seems fair and an improvement as you can choose between that and not. Unlike say offerings from Meta where the data selling and efforts to further gambling addiction is always included.
- haritha-j 1 month ago
  
  Which part are we supposed to have an issue with? The selling data to offer cheaper compute? Products taking over markets with below cost pricing because they have money and ruining the free market?
  Because all of that is considered totally okay when every single US big tech company does it.
- gmerc 1 month ago
  
  All I can say is lol. DeepSeek showing 3 orders of magnitude efficiency gains over the performative capital furnace that was training and inference absolutely moved the bar here.
- Grimblewald 1 month ago
  
  Chinese models are years ahead of American models on multimodal comprehension. Better yet, they publish on what makes the models tick and release weights openly.
  Chinese research output, publicly released, has also contributed in big ways to features present in every single US model. Yours is a bit of an unfair take I'd say.
  Besides, Claude will think it's ChatGPT sometimes, so clearly this isn't a problem restricted to China. Turns out unethical companies will do unethical things /shrug
- eru 1 month ago
  
  > This isn't "the market working as intended", this is an exhaustion fight to the bottom where the one with most money gets to stay in the market. As with most venture capital startups. I believe this VC tactic is a well documented "cheat code" to bypass market forces and build a monopoly. I find it hard to compare that with a free market.
  Why? Lots of people try this tactic, but hardly anyone ever succeeds. Meanwhile, the customer benefits.
  
- dv_dt 1 month ago
  
  A trillion dollar IPO just occurred for a company whose main line of business is almost entirely subsidized by government contracts.
  
- Joker_vD 1 month ago
  
  > This isn't "the market working as intended", this is an exhaustion fight to the bottom where the one with most money gets to stay in the market.
  That's, uh, pretty much exactly how oligopolistic markets function.
  > I find it hard to compare that with a free market.
  Well, to have a free market you need to remove as many barriers to entering the market as possible. Huge capital investments required for entry and intellectual property laws are two examples of such barriers. Subsidies are kinda supposed to help alleviate the first one.
- whateverboat 1 month ago
  
  I mean, for what it's worth, we have subsidized Anthropic by allowing them to train on copyrighted stuff. (I know it is still legal, and I support the legality, but the economics are what they are, with people's content paying a big one-time subsidized cost, to the level of at least 500B.)
  So, the least Anthropic can do is pay it forward.
  
- joelanman 1 month ago
  
  Doesn't VC money subsidise stuff all the time? Isn't that how Uber and AirBNB undercut competition?
- slake 1 month ago
  
  The VCs footing the bill are really your pension funds and 401Ks and banks passing through the VCs. If VCs lose money the contagion spreads through the economy.
roenxi 1 month ago
I get the vague impression that this was written in a sarcastic way, but it has a straightforwardly true literal read because yes, this is what the free market is about and Anthropic will have to compete with the Chinese if they want a big share of the market. Chinese models are cheap and good; even without reselling Anthropic's services they're competitive. Which reading did you intend?
And, gotta say, the idea that the Chinese are better at selling US models than the Americans is hilarious. There might be an economic study here somewhere about just how anti-consumer and anti-progress their IP laws turned out to be. We've got an entire postindustrial revolution centred around who can ignore the most stupid laws.
- andsoitis 1 month ago
  
  > the idea that the Chinese are better at selling US models than the Americans is hilarious
  This is not the right deduction.
  China blocks foreign AI from operating there.
  
gruez 1 month ago
>This is great for competition! Chinese vendors offering a cheaper solution = what economics told me the free market was all about.
Yeah, like all those Chinese bootleggers selling DVDs for a few dollars rather than $20. Free market!
https://news.ycombinator.com/item?id=48664814
- dualvariable 1 month ago
  
  "Information wants to be free"
  Anthropic profited from training its models on all kinds of copyrighted information, live by the sword, die by the sword...
  Their model weights, training data, training methods, etc are all going to leak to China over time.
  Nobody on a site named _Hacker_ news should be all that upset about this.
  
- adjejmxbdjdn 1 month ago
  
  Bootlegging is copyright theft.
  Is Claude output copyrighted?
  If anything, a tremendous amount of Claude’s input is copyrighted.
  If there’s any bootlegging going on it’s Anthropic that’s doing the bootlegging but having mirrored the video etc sufficiently to beat copyright law.
  
- xdennis 1 month ago
  
  > Yeah, like all those Chinese bootleggers selling DVDs for a few dollars rather than $20. Free market!
  It's supremely ironic to analogize distillation to copyright infringement when it's literally what Anthropic was found guilty of. It's not illegal to distill. It is illegal to pirate. And it's what Anthropic was found guilty of, not Alibaba.
  https://apnews.com/article/anthropic-authors-copyright-judge...
- gaiagraphia 1 month ago
  
  It's quite curious how multi billion dollar enterprises can't compete with a Chinese bootlegger with a big jacket, tbh.
  Imagine having such a warchest and being so bad at business, lol.
  
- bandrami 1 month ago
  
  The output of Claude is not eligible for copyright protection. I'm not sure how the analogy of bootlegging DVDs would work, given that.
  
- fulafel 1 month ago
  
  A free market would of course allow bootleg DVD sales; state regulation that gives monopoly rights restricts it.
  In the context of LLMs, monopoly rights haven't been created (yet anyway).
  Fun fact: for a period the US (or American colonies) didn't have copyright but Europe did, so people could copy and sell English (and other) books for free.
- SiempreViernes 1 month ago
  
  BigAI are all in the bootlegging market themselves, so it's always funny to see them complaining about others copying their "product".
- chews 1 month ago
  
  I bet you've downloaded a car.
- nmfisher 1 month ago
  
  More like one bootlegger complaining that another bootlegger is copying their bootleg DVDs.
- InvertedRhodium 1 month ago
  
  And those darned printing presses distributing works that were written prior to their existence.
- thot_experiment 1 month ago
  
  This is also a good thing fwiw.
- windexh8er 1 month ago
  
  Except Anthropic didn't produce the movie.
  So it's more like one bootlegger sold the DVD for $20 and their competitors are undercutting them for $1. Who's the bigger thief here now?
  Capitalism as intended!
Mistletoe 1 month ago
AI was always going to be a race to the bottom and low margins. It’s why I’m extremely bearish on AI as an investment. It’s framed as some high margin business when it’s really going to end up like your toilet paper at Costco. You will use whatever is cheapest and gets the job done.
- 4ffsss 1 month ago
  
  Correct.
  And the value-add experiences that utilise LLMs require immense imagination et al that folks at Anthropic will not be able to conceive of - given that they have made immense sunk investments in existing assets. This clouds one's thinking immensely.
  Both OAI and Anthropic have tremendous failure risk and this is of course not reflected in the fake private market valuations.
  I see a world where lots of stuff is mass produced in China (tokens) but the actual goods that deliver the experiences are designed, marketed and sold in the west at much higher prices. Of course this is a nightmare scenario for Anthropic et al.
  
- XenophileJKO 1 month ago
  
  I used to think this.. but I think my opinion is changing. The reason is that the leaders likely will be able to accelerate faster.
  So what you see is the market "stretching".. the bottom getting cheaper and the top end running away and getting more expensive. At some point the top end may be too valuable to even sell access to.
  
neya 1 month ago

> Or does this not apply to huge US corporations any more?
When it comes to favorite companies of the tech communities, it's almost always "Rules for thee, but not for me"
The standard stance is "they can do no wrong and they are absolutely perfect". I mean, look at any thread with anything about Apple in it.
m-ee 1 month ago
It never did.
In Debt: The First 5,000 Years, Graeber makes the case that pure “free market” trade has never really existed in “the west”. The closest to this ideal that’s ever happened was during the Islamic golden age enabled by religious prescriptions against usury.
- gruez 1 month ago
  
  >The closest to this ideal that’s ever happened was during the Islamic golden age enabled by religious prescriptions against usury.
  How are bans against consensual financial exchanges close to the "ideal" of the free market? It just sounds like you have an axe to grind about the financial system rather than describing free markets.
  
- MarsIronPI 1 month ago
  
  Usury (i.e. taking interest) sounds like free market to me. If you don't like my interest rates, borrow somewhere else.
- UltraSane 1 month ago
  
  Without interest why would anyone loan money? Even the Islamic banking alternatives all just hide the interest charges.
  
- gaiagraphia 1 month ago
  
  >religious prescriptions against usury.
- jujube3 1 month ago
  
  Graeber was a confabulator with a very loose grasp of the facts, though.
- vasco 1 month ago
  
  You can read Adam Smith if you're looking for definitions, there's no need to read charlatans.
  
thesmtsolver2 1 month ago
> what economics told me the free market was all about.
Don't complain when the US starts to play by the same rules China has been using for decades.
- solid_fuel 1 month ago
  
  What is the implication here? Are you warning that US corporations might start doing something shady, like scraping the internet at large scale for training data? Or mass-downloading pirated copies of books, completely ignoring copyright?
  I find it hard to imagine a future where US corporations have degraded to such a point.
  
- _aavaa_ 1 month ago
  
  How do you think the major AI companies trained their models? Pirated books and anything that could be torrented and scraped off the web.
  
- thisisit 1 month ago
  
  American industries used to play by the same rules. Look up Samuel Slater.
- potsandpans 1 month ago
  
  A credit system that determines your upward mobility?
abc42 1 month ago

Free markets work when paired with property laws that can be enforced if broken. If China could offer a cheaper solution in that framework, it would be as you say.
janalsncm 1 month ago
If you continue studying econ you will learn about the various failure modes of free markets including the free rider problem.
https://en.wikipedia.org/wiki/Free-rider_problem
- achierius 1 month ago
  
  If you keep studying econ you will learn that these failures are actually the norm, which is why the only "capitalist" states to really succeed have been the ones where the state was strong enough to rein in the market.
  Of course, such a state of affairs is temporary at best -- since the alternative is so lucrative!
Levitz 1 month ago

>This is great for competition! Chinese vendors offering a cheaper solution = what economics told me the free market was all about.
Ah yes, systematic fraud and protectionist practices, free market through and through.
TZubiri 1 month ago

Nuance please.
They are:
1- breaking the terms and conditions of the service
2- getting banned and creating thousands of accounts to break the conditions of the service at scale
3- using VPNs and proxies (possibly residential) to mask their network origin and identity
4- Using potentially fake names to sign up
5- Using different credit cards?
Fraud on so many levels, a lot of the infrastructure and modus operandi is what cybercriminals use, these are attackers man, whether you like the victim or not, and whether you think it's poetic or not, I recommend compartmentalizing and just trying to gauge whether an act is wrong or not in itself.
toss1 1 month ago
Externally subsidized predatory pricing is the opposite of a free market.
- amanaplanacanal 1 month ago
  
  So all those companies selling at a loss to gain market share aren't part of the free market? Like OpenAI, Anthropic, and SpaceX?
  
- noncoml 1 month ago
  
  Cough.. cough.. Uber.. cough cough AirBNB
AlexCoventry 1 month ago

The "free market" gave the PRC its current strategic lock on rare-earth minerals. There's definitely no such thing as a free market in a Maoist dictatorship. I personally think the "free market" concept is an unachievable ideal and thought-terminating cliche, but "free market in a Maoist dictatorship" is for sure a contradiction in terms.
skybrian 1 month ago
I guess you missed the fraud part.
- Gigachad 1 month ago
  
  Pulling out the world's smallest violin for this case. It's just unheard of for AI companies to steal things.
  
- gaiagraphia 1 month ago
  
  >Fraud
  According to which lawyer caste?
  Are American laws absolute truth? If not, who cares?
  
- LtWorf 1 month ago
  
  Has any tribunal ruled that fraud did happen?
- techblueberry 1 month ago
  
  Fraud is just what losers call disruption.
kburman 1 month ago

[flagged]
naturalmovement 1 month ago
Do you also think Chinese selling counterfeit US postage stamps on eBay for 50% retail price (which is a major problem CBP and USPIS are fighting presently) is the free market at work?
This post is so delusional and dripping with condescension I've read it three times and I still can't figure out if you're trolling or not.
- bandrami 1 month ago
  
  Postage stamps have specific legal protection from duplication. The output of an LLM is not itself eligible for any legal IP protection.
  
- achierius 1 month ago
  
  Post offices aren't meant to operate in the free market. More things should be like them.

SXX 1 month ago

Today I learned I can both save on tokens and help Chinese labs train better models. Will certainly go use scraper APIs for everything that doesn't contain security-critical data.

Thanks for the heads-up, Anthropic!

geokon 1 month ago

Seems like a fair play by Alibaba. However, is there any "open source" attempt at crowdsourcing distillation?

Like some place people can submit their chatbot convos so they can be aggregated?

Like an equivalent to OpenCrawl but for mining the models. It feels like that'd be a richer dataset than Alibaba generating queries and feeding them into Anthropic/OpenAI models.

PS: Does anyone know how the synthetic queries are generated when companies distill each other's models? I'm just assuming they'd be worse than organic ones.

rvz 1 month ago

Notice how Anthropic is now scapegoating Chinese model providers like Alibaba and outright accusing them of distilling their models.

Whether if it is true or not, this is part of their effort to use them as an example to scare everyone into getting Congress to ban powerful models from being accessed outside of the US and also to ban powerful local models from being released.

Anthropic does not care about you, and they are not your friends.

sheepscreek 1 month ago

I think it’s more than that. Piecing together the perspective of a few commentators in this post - it’s plausible Anthropic is trying to shift the narrative from US vs. Rest of the world to US vs. China.
In other words, they want to sell Fable or future more powerful models to the rest of the world (presumably all future models are going to be more powerful than current gen). One way they can sell this to the government is by scapegoating China (which is their primary concern anyway).
This is working on the presumption that non-US companies form a material portion of their current revenue.
re-thc 1 month ago
> Whether if it is true or not
If it was just "that easy" then I doubt only "Chinese models" would be doing it and we'd already be packed with competition.
Distilling might be a thing but it isn't a free win.
- skeledrew 1 month ago
  
  Only China really has the resources (multiple labs invested in the space), culture (Asians are generally collectively-inclined, so sharing is in their core) and political bent (there will be no diplomatic repercussions) to put up a fight.
  

throwaway27448 1 month ago

"illicitly" is doing a lot of work here. IP makes no sense, and we get better software as a result. Who is going to cry if anthropic fails?

jrflowers 1 month ago

I like that they use “illicit” and “fraudulent” as if model distillation is illegal and giving them money and then doing whatever they want with the output of their publicly accessible models (which Anthropic does not own) is… also illegal?

“Anthropic, red faced after unattended ice cream cone eaten by ants on park bench, once again demands government pick it as forever winner, adds ‘no take backsies’”

fjdjshsh 1 month ago

>The strike by Alibaba is described as a "distillation" effort, which Anthropic has said involves training a less capable model on the outputs of a stronger one.

Claude used TB of content without permission to train their model and it was ok for them. Now someone else uses the output of a Claude model to train a model and they cry foul.

cubefox 1 month ago
It was not okay for them, they had to pay one billion dollars.
- callmeal 1 month ago
  
  >It was not okay for them, they had to pay one billion dollars.
  Essentially peanuts compared to what they would have to pay to obtain the rights of everything they pirated.
  
- p_j_w 1 month ago
  
  That’s a pittance compared to their revenue.

ece 1 month ago

It's hard to sympathize with Anthropic for this or the export ban, the hype over model capabilities probably fuels both things (in some ways). Training data for me, but not for thee (at any scale) doesn't seem like a tenable position. If anything, Claude's constitutional outputs should be trained on more rather than less.

estetlinus 1 month ago

It all sounds like a really fragile business model. I can't imagine a world where AI is NOT commoditized.

zb3 1 month ago

If true then Alibaba is doing us a public service, good job, I hope this extraction was successful.

zkmon 1 month ago

I don't understand. If they are simply using our API and paying for tokens, it's called a "transaction" and not an "attack". The user is our customer who is supporting our business by buying our services. And we call them attackers. We happily make money by selling our services, and then call it an attack.

Back in the day, an "attack" was supposed to mean someone acquiring our assets without paying for them or without having our consent. But none of this seems to have happened in this case.

We built a product without paying for most of the raw material we have used, and we don't call that an "attack". Did we change the meaning of "attack"?

alpineman 1 month ago

Did Anthropic 'attack' all those authors it was forced to pay $1.5bn to for using their work without permission?

BigTTYGothGF 1 month ago

If you're an AI booster surely you'd think this was a good thing as it means more models are available in more places to more people more easily. I'm exactly the opposite, and I think this is a good thing because I want Anthropic to suffer.

rikima_ 1 month ago
so it’s a good thing whichever way you look at it
- OutOfHere 1 month ago
  
  That's exactly right. One can be an AI booster and want Anthropic to suffer, all for the greater good of promoting access and diversity of AI.
nonethewiser 1 month ago
That doesn't follow.
- BigTTYGothGF 1 month ago
  
  Which part?

NDlurker 1 month ago

I don't see what the problem is. They found a loophole and exploited it. Good for them.

jonplackett 1 month ago

How can there be any moat for AI ever, if you can just steal a model by talking to it?

gspr 1 month ago

This is what I find the most fascinating about the people arguing that you can copyright-wash anything (e.g. FOSS code) by passing it through an LLM. Surely that same logic applies to the LLM itself?!

NietTim 1 month ago

Oh no, the thief is mad they get "stolen" from? I've had to hard block all of anthropic's scrapers because they seemingly ignore robots.txt and every other unwritten rule about 'polite' webscraping. They were so aggressive that they made up 90+% of our traffic + pulled the website down at times. And that's not even mentioning their other immoral practices + what they are claiming here is questionable at best. Anthropic is _not_ the good guys, they do not get to be upset over this or claim any sort of moral superiority over 'China'.

Can't wait for the new Chinese models.

kazinator 25 days ago

> Anthropic said the campaign was conducted between April 22 and June 5, 2026, and generated more than 28.8 million exchanges with Claude through almost 25,000 fraudulent accounts.

Anthropic's entire business is based on actual stealing.

But when someone creates 25,000 legitimate accounts using mechanisms that Anthropic offers to the public, they are conveniently called fraudulent.
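
For scale, the figures in the quoted passage work out roughly as below. This is a back-of-the-envelope sketch only: it assumes the date range is inclusive and that usage was spread evenly across accounts, neither of which the article states, and it treats "more than 28.8 million" and "almost 25,000" as exact numbers.

```python
from datetime import date

exchanges = 28_800_000  # "more than 28.8 million exchanges" (treated as exact)
accounts = 25_000       # "almost 25,000" accounts (treated as exact)

# Assumption: both April 22 and June 5 count as active days.
days = (date(2026, 6, 5) - date(2026, 4, 22)).days + 1

# Assumption: every account carried an equal share of the traffic.
per_account = exchanges / accounts
per_account_per_day = per_account / days

print(f"{days} days")
print(f"{per_account:.0f} exchanges per account")
print(f"{per_account_per_day:.1f} exchanges per account per day")
```

Under those assumptions that is about 1,152 exchanges per account over 45 days, or roughly 26 per account per day.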

"Fraudulent" is just any usage pattern I don't like. Oh, you skipped to the last chapter of my mystery novel to find out whodunit? Why that's fraudulent reading.

one33seven 1 month ago

Well, Anthropic stole their training data from hundreds of people, now someone stole the result from Anthropic. Seems fair, I hope someone releases it for free so we can train away the guardrails and have some fun

thadk 1 month ago

Does anyone have hints on what kinds of prompts are most used for a distillation like this—SWE-Bench sorts of things?

Is reconstructing the compressed knowledge in the model like reconstructing a lossy JPG or MP3 a reasonable analogy?

dannyw 1 month ago

RLAIF is a good place to start reading.
Claude will also help you with (mostly good advice) if you ask something like “Research and help me make the most effective plan to train a smaller student model to be better from a teacher model”.
I actually was doing an experiment with a GLM->Gemma E4B for fun, and Claude kept on suggesting I should also add Claude Opus as a teacher lol, suggesting techniques I haven’t heard of like thinking inversion (train a small model to deconstruct summarised thinking into detailed native thinking format of the student).
So I can absolutely see and understand the concern around Fable’s frontier LLM development mitigations, but their approach of silently degrading is completely wrong and dangerous.
AI classifiers, like all AI, can make mistakes, and it’d only be a matter of time before one misfires and silently sabotages a university’s HPC cluster for physics simulations or something because the shape looks like DeepSeek or whatnot to a dumb fast classifier.
Chu4eeno 1 month ago

There are some Claude datasets (of indeterminate provenance) floating around on huggingface you can look at (or at least used to be, they might've been taken down).

TheAceOfHearts 1 month ago

Someone should set up a plugin or something for Claude Code that makes it easy to log all inputs and outputs for people who are willing and interested in sharing their usage. I don't want Anthropic to be the only company that can train on my usage, I want to share my usage so it can be used for training all new models.

Once you have a system for collecting all logs, you just need a place where they can be submitted. Ideally it would be a freely licensed dataset that is publicly available for everyone.

Has anyone built this yet?
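
A rough sketch of what the collection side could look like, in case it helps anyone get started. Everything here is hypothetical: the `redact` and `log_exchange` names, the JSONL layout, and the two regex patterns are illustrative assumptions, and it does not hook into Claude Code itself (how you capture the prompt and response is left open). The patterns are nowhere near a complete secret scanner.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical, deliberately simple patterns. A real tool would need a much
# more thorough secret scanner before anything is shared publicly.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),                     # API-key-like tokens
    re.compile(r"(?i)(password|token|secret)\s*[=:]\s*\S+"),  # key = value pairs
]


def redact(text: str) -> str:
    """Replace anything matching a secret-like pattern with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def log_exchange(path: str, prompt: str, response: str) -> dict:
    """Append one redacted prompt/response pair to a JSON Lines file."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "prompt": redact(prompt),
        "response": redact(response),
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

One record per line keeps the file easy to append to and easy to merge with other people's logs later. Redacting before writing, rather than at upload time, means the raw secrets never hit disk in the shared file.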

thomasfromcdnjs 1 month ago
Discussed building it with my friends, obviously you might share secrets and other real reasons, but if gangs of corporations are already doing it, I don't see why we shouldn't just share it amongst the crowd too.
- TheAceOfHearts 1 month ago
  
  Yeah I could see it being a problem if you're doing work on closed source or repos with sensitive credentials. Since my usage has all been on open source projects I'd be happy to share everything I'm doing if it can help train better models.
slaw 1 month ago

I don't mind being paid for using LLM, but working for free?
cush 1 month ago
Yikes, no thank you
- TheAceOfHearts 1 month ago
  
  Do you have a substantive reason why you dislike this? What is the problem if it's opt-in? Nobody is forcing you to share your usage if you don't want to.
  I'd prefer it if all the model builders could train on my usage rather than being limited to a single company. That'll hopefully help make all the models better in the long-term.
  

20k 1 month ago

it sure sucks when people steal your hard work for free without paying for it doesn't it anthropic

danw1979 1 month ago

I’ve been thinking about what happens when Claude’s weights eventually get stolen. Wouldn’t that just open the door to the backmarkers to run inference-for-distillation on their own models ?

I guess the accusation that they’re using public access to the model via subscriptions indicates that weight theft probably hasn’t happened yet ?

Or maybe subsidised inference via subscriptions means it’s just cheaper to distill this way rather than stealing weights and running inference yourself ?

pyrale 1 month ago

Did Alibaba procure tons of stuff from Anthropic without paying, and use it to train a model?

I don't see the issue. Didn't Anthropic train on our data, which it acquired illegally?

lambdaone 1 month ago

The horse has bolted some time ago on this; the "frontier" is not as inaccessible as it once was, and open models, once out there, can't be put back in the bag.

Even if the US bans open models, the Chinese and Russians will still have them, along with the rest of the world including cybersecurity attackers, and that's probably the worst-case scenario for the US.

The only way forward now is open models and how we restructure society around them.

gloosx 1 month ago

So why don't they proceed with a lawsuit instead of public accusations? Let the court decide if these "distillation attacks" are actually illicit.

redlewel 1 month ago

I see this as valid use, they are paying for the tokens to get this reasoning aren't they?

Obviously they didn't ask for permission when scraping all of libgen, reddit, all blog sites for FREE. When China pays for its use and does it I'm supposed to see it as some sort of problem?

Furthermore Chinese models getting better means we Americans might have the chance to use top tier AI without strict KYC built around it. Go Alibaba I say

lossolo 1 month ago

> Meanwhile, on June 12, two days after Anthropic sent the letter, the Commerce Department imposed controversial restrictions on Anthropic's latest Mythos and Fable AI models because officials feared they could be deployed by military intelligence users in China and other countries of concern.

So that was the real reason for the Fable restriction? Because Anthropic wrote a letter to the US government saying that China was distilling Fable?

igleria 1 month ago

https://en.wikipedia.org/wiki/Ali_Baba_and_the_Forty_Thieves

> In the original version, Ali Baba (Arabic: عَلِيّ بَابَا, romanized: ʿAliyy Bābā) is a poor woodcutter and an honest person who discovers the secret treasure of a thieves' den, and enters with the magic phrase "open sesame".

Open sesame alright...

rw2 1 month ago

This is making the case for Anthropic KYC for US citizens. No one would allow their accounts to do this if they were on the hook for it from the US government.

api 1 month ago

AI is awesome tech but it’s also to some extent mass piracy. The models are trained on huge amounts of material with dubious or non existent rights.

I have a hard time being concerned about “you pirated my piracy.”

I hold the view that many of these models should not be copyrightable. Anthropic and all the others talk about “safety” but you never hear them bring up attribution of the data that trained the model or compensation of anyone for it.

rvba 1 month ago

Why is it called "distillation" when it seems to be "scraping"? (as in web scraping)

When bots open the same board 1 million times per day it is web scraping to train the AI model and OK. When someone asks 150 thousand questions it is now distilling.

On an unrelated note, 150k queries feels like nothing?

Scrapers seem to account for 50% of total internet traffic.

Do they use different methodology since it is suddenly bad when scraping happens to them?

yogthos 1 month ago

So let me get this straight, a company which built its whole business on ignoring IP is all of a sudden upset that somebody is not respecting their IP?

a34729t 1 month ago

You know what? We should all get Claude Max subscriptions and max them out hard and post our full conversations on codeberg, as an open training set.

democracy 1 month ago

yc pitch?

krater23 1 month ago

Oh, Alibaba distilled data without consent out of Anthropic's models that are trained with data from the internet without consent? Who cares?!

NoImmatureAdHom 1 month ago

⢰⣶⣶⣤⣄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⢻⣿⣿⡏⠉⠓⠦⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⣀⣀ ⠀⠀⢹⣿⡇⠀⠀⠀⠈⠙⠲⣄⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⣠⡴⠖⢾⣿⣿⣿⡟ ⠀⠀⠀⠹⣷⠀⠀⠀⠀⠀⠀⠀⠙⠦⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⣤⠶⠚⠋⠁⠀⠀⣸⣿⣿⡟⠀ ⠀⠀⠀⠀⠹⣇⠀⠀⠀⠀⠀⠀⠀⠀⠈⠓⢦⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡴⠖⠋⠁⠀⠀⠀⠀⠀⠀⠀⣿⣿⠏⠀⠀ ⠀⠀⠀⠀⠀⠙⣦⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⣀⣀⣀⣀⣀⣀⣀⣀⣀⣀⣀⠀⣀⡤⠖⠋⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣸⡿⠃⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠈⢳⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⠉⠉⠉⠁⠀⠀⠀⠀⠀⠀⠀⠀⠈⠉⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣰⡟⠁⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠙⢦⡀⠀⠀⢀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡴⠋⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣦⣠⡿⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠠⡄⠀⠀⢀⡴⠟⠁⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⠟⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢹⣦⠾⠋⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢠⠏⠀⠀⠀⠀⣠⣴⣶⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢠⡴⣶⣦⡀⠀⠀⠀⠀⠀⠹⣆⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⢀⡏⠀⠀⠀⠀⠀⣯⣀⣼⣿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣿⣄⣬⣿⡇⠀⠀⠀⠀⠀⠀⠘⣧⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⣼⠁⠀⠀⠀⠀⠀⠻⣿⡿⠏⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⠿⠿⠟⠀⠀⠀⠀⠀⠀⠀⠀⢹⣇⠀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⢀⡇⠀⢀⣀⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⢰⣷⣶⠤⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⢿⡀⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⢸⢁⡾⠋⠉⠉⠙⢷⡄⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣴⠞⠋⠉⠛⢶⡄⠀⠀⠘⡇⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⣿⠸⣇⠀⠀⠀⠀⣸⠇⠀⠀⠀⠀⠀⢀⣠⠤⠴⠶⠶⣤⡀⠀⠀⠀⠀⠀⠀⣇⠀⠀⠀⠀⢀⡇⠀⠀⠀⢿⠀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⢿⠀⠉⠳⠶⠶⠞⠁⠀⠀⠀⠀⠀⠀⢾⡅⠀⠀⠀⠀⠈⣷⠀⠀⠀⠀⠀⠀⠙⠷⢦⡤⠴⠛⠁⠀⠀⠀⢸⡀⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠈⣧⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⣤⡀⠀⠀⣠⠟⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⡇⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⣷⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠙⠛⠋⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣇⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠘⣇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢹⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⠀⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⢸⣇⣀⣀⣀⣠⣠⣠⣠⣠⣀⣀⣀⣀⣀⣀⣄⣄⣄⣄⣄⣠⣀⣀⣀⣀⣠⣠⣠⣠⣠⣠⣀⣀⣀⣀⣀⣼⡆⠀⠀⠀⠀⠀⠀ ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠀⠀⠀⠀⠀⠀⠀

toss1 1 month ago

Never mind government edicts & bans -- this seems like reason enough for them to require Know Their Customers, require ID, and shut off certain nations.

Failing to have done so seems to have allowed 25000 fake Chinese accounts to walk off with their product...

OFC I wouldn't trust the Chinese enough to ask their models the time of day, but Anthropic seems to have allowed far more ... yikes

bozdemir 1 month ago

Oh wow !!! Anthropic always asks people individually if they can train on their personal data. I'm shocked ! Bad Alibaba ! Bad....

Kuyawa 1 month ago

Our data, from the end users, has been harvested for decades by big corps and now they say it belongs to them? Oh teh irony!

seanclayton 1 month ago

They trained their AI on their AI. Anthropic trained their AI on a bunch of copyright-protected works. Sucks to suck, Dario!

zftnb666 25 days ago

Model distillation is illegal now? Better tell every AI company that used GPT outputs to train their models

dev1ycan 1 month ago

It's so funny how LLMs, which were trained on millions of stolen books (and even if they weren't stolen, which they were, pirated from online pirate sites like libg and annas, they didn't have consent for the VAST majority of them), and stolen code, and stolen comments, etc., now complain about their stuff getting "stolen"... lol.

phplovesong 1 month ago

Here's my guesstimate on the future:

Companies like Anthropic will be using the same model as anyone else. They just bring value in having a fast datacenter and agent.

It's stupid to even think that a general model like Opus would be the real value.

Models age fast, new ones come along, and the end user won't care "whose model it is", just that it is fast and sharp.

egyptianblue 1 month ago

If the concern is that China is catching up on model capabilities (which is only a big deal if you lean in to adversarial geopolitical zero-sum thinking), the fact that they're using American models to train theirs should give people comfort that they're nowhere near the cutting edge

poulpy123 25 days ago

Oh no, the Chinese did to us what we did to 4000 years of us culture, it's really a shame !

ProAm 1 month ago

Says the company that is involved in one of the largest copyright heists of all time to build its product.

robotburrito 1 month ago

Isn’t this fair game? Didn’t these companies basically steal to make these models to begin with?

onetrickwolf 1 month ago

“Distillation attack” are we joking here.

If anything these models should be compelled to be public since they have been trained off public data. What an absurd overreach to call this an attack.

It’s clear they are scapegoating national security and China at this point to build an anti-competitive moat.

I generally really like Anthropic’s work and models but stuff like this scares me for the future. We are positioning these companies to have too much power. The public’s life is getting worse while these companies consolidate power using data they stole from the public.

_fat_santa 1 month ago
> If anything these models should be compelled to be public since they have been trained off public data
I'm starting to come around to this idea TBH. For a while my position was: "these companies have invested billions into training these models, therefore they should be able to control them and profit off them" but looking deeper at where they got their training data, my view is starting to shift.
IMHO I feel like we need new laws around AI, specifically training data. Something like: "you can train an AI model and ignore copyright laws, BUT you must then make the model open weight". A company can still develop closed weight models but then they must acquire permission to use training data.
But it gets murky because if something like that was on the books then AI labs would just train open weight models and then distill them into their closed weight models.
- ivanovm 1 month ago
  
  labs invest multiple billion dollars a year each in private data, and that number is growing. internet training data is not where frontier capabilities come from, this view is outdated
  
- threethirtytwo 1 month ago
  
  I'm not taking sides here but this situation is not so black and white and it has always been the darker side of capitalism.
  The concept of Intellectual property exists not because it's fair but because it creates incentive to make said "intellectual property" exist. If intellectual property can be instantly copied by a competitor... why would I spend a dime to even create such a thing? I want to profit off of what I make because I'm a capitalist and money is what drives me (as a capitalist).
  Anthropic models wouldn't exist if they couldn't keep an unholy grip on them. Same with OpenAI. Same with many life saving drugs.
  Of course everyone here is talking about the obvious stuff like how it's morally wrong to withhold life saving drugs or to have AI literally take over the world and be under the control of one company and all of this is true. But it is also true that greed is the engine that drives our economy and if you want our economy to produce "intellectual property" you must allow people to "capitalize" on that greed.
  There are two controversial issues here. What is moral/fair? And what is realistically practical in optimizing the economy if said economy is based on money.
  The distillation in my mind is a win for practicality because Competition also drives our economic engine. First you don't want a monopoly, but you also don't want these models to be so damn open that there's zero incentive to make them.
  
rafram 1 month ago
The core of the training data is public, but the part that actually makes these models smart came from (pretty highly-paid) experts via platforms like Mercor. Claude didn't magically learn to write good code by reading all of GitHub - humans trained it in that, more or less manually.
- rapind 1 month ago
  
  If you pay me to curate a playlist of musical hits, can you now publish and charge people for access to that playlist (*including the curated material)? Can we do the same with movies? Books?
  /edit Added a note to make it more obvious that the material is included in the playlist, just like the material is incorporated as part of curated AI models.
  
- datsci_est_2015 1 month ago
  
  Given the breadth of LLM knowledge, I somehow doubt this. Sure, it’s probably responsible for the quality of LLM insights, but I don’t think anyone was asking experts about e.g. the complex ecological effects of invasive zebra mussels and their provenance in Lake Michigan.
- visarga 1 month ago
  
  No, they do RLVR (reinforcement learning with verifiable rewards) like everyone else. And probably use claude data too, with human in the loop and tool feedback.
- jaen 1 month ago
  
  ...and the rest of the training data (ie. the entire corpus of copyrighted works) was not written by experts expecting compensation? Double standards.
  
- freejazz 1 month ago
  
  So? What about the authors of all the works these companies stole?
slibhb 1 month ago
> If anything these models should be compelled to be public since they have been trained off public data. What an absurd overreach to call this an attack.
> It’s clear they are scapegoating national security and China at this point to build an anti-competitive moat.
If all that is required to train these models is public data, why can't Alibaba just use that?
The fact that Alibaba has to resort to scraping Claude suggests there already is a moat...
- KerryJones 1 month ago
  
  This feels more nuanced than you are giving it credit for? Much of the training data that was available has been withdrawn, and at least for OpenAI we know that much of the training data was garnered through less-than-above-board methods
cma 1 month ago

Since they hide their thinking traces it really doesn't make too much sense. We know one of their fixed degradations they talked about in a recent blog post was if you left claude code idle for too long they would rehydrate it without the thinking traces in the context and it degraded performance. So direct forms of distillation wouldn't be expected to get as good of results as they are getting.
However, they could have used it as a judge etc. during training.
flowerlad 1 month ago
Should Google search index be forced to be public too?
- calgoo 1 month ago
  
  Honestly, yes it should in some form. If their index contains the actual data from the sites, and they are making that information public in one way or another, then it should be available as a downloadable dataset.
  
zobzu 1 month ago

it's mainly just a lot cheaper. copying is always cheaper anyway, very little r&d - ai or no ai.
coliveira 1 month ago

What they're trying to do under the umbrella of "national security" is to legislate how we can use the results we pay for when accessing these models. This way they will control the "intellectual property" that was acquired illegally.
petilon 1 month ago
> If anything these models should be compelled to be public since they have been trained off public data.
Isn't that a bit like saying if you read books in a public library to pick up a new skill you should work for free?
> What an absurd overreach to call this an attack.
Would it be an attack to take your meal by force if you used a public recipe to prepare the meal?
- topgrain2 1 month ago
  
  > Isn't that a bit like saying if you read books in a public library to pick up a new skill you should work for free?
  Only if you’re trying to muddy the waters. No, obviously it’s not. One can also support licensing for driving a car on public roads but not for walking, even though both involve traveling. This is only confusing to people pretending to be confused, for effect.
  > Would it be an attack to take your meal by force if you used a public recipe to prepare the meal?
  “You wouldn’t download a car…” (unless it worked like copying an MP3, then, of course, you would, everyone would)
  It’s as if you’re using terrible analogies and comparisons because stronger ones don’t exist. Great news for the AI-should-be-open crowd.
  
rapind 1 month ago
> It’s clear they are scapegoating national security and China at this point to build an anti-competitive moat.
They are also fear mongering (and getting shills to as well) the idea that once open weight (Chinese) models catch up to Mythos we're all doomed. Maybe I'd be a bit less cynical if they weren't prepping for IPO?
Wasn't OpenAI spreading similar FUD back when GPT 2 came out?
Guys... AGI is right around the corner. Pinky swear. Now buy our stock.
Keep in mind that the entire US economy is currently propped up by AI spending, so a lot of people (banks, government) are incentivized to make sure these companies succeed. Expect this propaganda to ratchet up a notch if / when the economy starts to nose dive.
- ok123456 1 month ago
  
  Yes. They're turning on the consent manufacturing machine to make it an issue of "national security" to download some gguf file from Hugging Face. Absolutely disgusting.
msabalau 1 month ago
There's probably a 10-15% chance of a war between the US and China over the next 10 years. Maybe a better than even chance of a militarized crisis that might have led to war, but somehow de-escalates.
Regardless of how sad late stage capitalism makes you, or how outrageous one claims to find "hypocrisy", any national security argument about limiting Chinese AI capability stands on its own, at least for nations likely to be drawn into a war.
Also, all the local model enthusiasts who assume Chinese firms are going to be allowed to endlessly release models if they have the disruptive potential attributed to Mythos are probably in for a rude awakening. Just because the PRC is content about what has happened in the past doesn't mean that they would tolerate an open model that could be truly destabilizing.
- pseudony 1 month ago
  
  As a third party I would rather be happy about the way Chinese labs are acting in the here and now while US labs first masquerade as a public good, then turn around, bail on all promises of open AI, turn into a corporation and attempt to own the world while its runner-up is trying to scaremonger people into buying their product.
  I know most Americans are fed a steady diet of “evil China” and China MAY have issues. But on the AI front they are heaps better. Even if everything got closed tomorrow, we have a plethora of good models we can inspect and tweak while from the US labs we have… a single old 120b model ?
  And with the way the US is treating its allies, maybe a bunch of us are quite content with a more even match rather than US hegemony.
TZubiri 1 month ago
Two wrongs don't make a right
- tokioyoyo 1 month ago
  
  In this scenario it does, because consumers win.
  Everyone in AI industry wants to fight dirty, but gets angry when their competitor fights dirty as well. And I’ve mentioned it before, how I generally like Ant and its products.
- justapassenger 1 month ago
  
  Closest analogy to distillation is api reimplementation, without which current software industry wouldn’t exist.
  There’s nothing fundamentally wrong with distillation.
- moistoreos 1 month ago
  
  Pretty sure the second rectified the first.
rayiner 1 month ago
> The public’s life is getting worse while these companies consolidate power using data they stole from the public
How can you “steal” public information?
- calgoo 1 month ago
  
  really? You know this just like everyone else: Just because the information is available publicly, does not mean that you can do whatever you want with the information. Copyright exists for a reason, and if the copyright lobby is going to continue to push for the poor poor media companies to keep their copyrights, then we should do the same towards the AI companies. So yes, they Stole the information from everyone else, and they keep doing so, as you can see their scanners still hitting every website on the web to get an updated dataset. It does not matter what they do AFTER they steal all the information, as they already stole it.

anabis 1 month ago

Incentive is for users in general to release sessions (sans PII, credentials) so all AI gets better and there are alternatives. Even if China didn't do this, I don't see frontier labs being able to charge a premium over others for long. RSI maybe?

jackzhuo 1 month ago

Most Chinese models are now open-source, whereas OpenAI, Claude, and Gemini are closed. For example, every new DeepSeek model release is accompanied by a corresponding research paper, and it now fully supports Huawei's new chips.

bubblegumcrisis 1 month ago

When I was growing up, I thought "competition" was about better products. But looking at Google and Apple, Meta, and AI - "competition" is actually about creating monopolies through evil business practices.

Growing up with the birth of the internet - I really did think it would be a force for transferring power and authority to the people. Sigh, I was so wrong.

Where are the companies that declare, "we will be the best, come at us!"

Where are the politicians who are supposed to represent us? Oh, right. I forgot for a moment.

freeopinion 1 month ago

Wallace Shawn was in on the joke when he expertly delivered the original line. It seems like Anthropic has spent years and billions of dollars to recreate the entire scene.

But what will become of the princess in Anthropic's recreation?

AdieuToLogic 1 month ago

The hypocrisy of Anthropic complaining about "illicitly extracting its Claude AI model capabilities" and supporting the White House's accusation of China "stealing U.S. AI labs' intellectual property on an industrial scale" is hilarious.

Anthropic, OpenAI, Google, Microsoft, et al trained their models by ignoring the rights of copyright holders when harvesting whatever content they could. Now one of them is crying foul for another entity doing exactly what they all did?

Hilarious.

protimewaster 1 month ago
The AI companies seem to take the viewpoint that everything on the internet is free, except their stuff. It's okay to hammer some random website with AI crawlers, ignoring robots.txt, and causing bandwidth costs to skyrocket. But if you cost an AI provider money with your data acquisition practices, well, that's just clearly unacceptable.
- eunos 1 month ago
  
  Anthropic, and Dario especially, seems to have an eternal grudge against China as a concept. That reminds me of Thiel.
  
- xdennis 1 month ago
  
  That's one aspect, which is a bit of a gray zone. But Anthropic trained on pirated books. That is explicitly illegal.
  
- AdieuToLogic 1 month ago
  
  > But if you cost an AI provider money with your data acquisition practices, well, that's just clearly unacceptable.
  It's the same question libertarian advocates cannot resolve:
  If one truly believes in personal sovereignty, how are shared resources paid for, such as roads, power grids, potable water, sewage services, fire departments, and police departments?
  It is also not a coincidence that leadership in many tech companies have expressed libertarian ideals.
  
- WarmWash 1 month ago
  
  >The AI companies seem to take the viewpoint that everything on the internet is free,
  The AI companies? That's been the common ethos of the internet for 40 years
  I mean, raise your hand if you ad block and have a hard drive of pirated content...
- MagicMoonlight 1 month ago
  
  [dead]
amanaplanacanal 1 month ago

It's not exactly the same, since any Claude output is public domain under current law. So the Chinese aren't stealing anything here.
deaton 1 month ago

Not really even in the same ballpark as what they did. These other labs are using AI generated content (which has already been ruled un-copyrightable) to train their models. Oh and they are paying for those tokens. So at absolute worst, they are violating the terms of service. The horror. Meanwhile these frontier AI labs pirated and scraped everything they possibly could, paying not a dime to the copyright owners, nor paying anything to the websites they DDoSed.
tom2026hn 1 month ago

What's yours is mine, and what's mine is still mine.
mannanj 1 month ago

there's no honor among thieves.
inquirerGeneral 1 month ago

[dead]
1f60c 1 month ago
Not really.
Data mining for AI is presumably fair use, whereas when you sign up for a Claude account, you enter into a legally binding contract that says you will not distill a model based on its outputs.
- amanaplanacanal 1 month ago
  
  I guess they can try to sue. Good luck.
- iAMkenough 1 month ago
  
  “Legally binding” bs that a judge would laugh off.

dminik 1 month ago

This is supposed to be negative, but all I can really think of is "Good."

ycui7 1 month ago

In a few more months, when a Chinese model gets to Mythos-level capability and Fable is still locked down, what will Anthropic say? Why can't they just admit they are not the only people who know how to train an LLM?

otikik 1 month ago

If it's out there on the internet it's ok to use it for training, independently of what the licenses or the TOS say.

If not, then we should look at Alibaba, but we should look at Anthropic as well.

qsxfthnkp2322 23 days ago

LOL shared accounts are the norm.

Pool together a few. Pay for one account. Share the love for cheaper than single people are allowed to pay.

gaiagraphia 1 month ago

A company which got rich on extracting the world's content is complaining that another company has extracted their work?!

LOL!

Get a grip, son.

jryan49 1 month ago

So they can train on everyone's copyrighted works to create their model, but when someone trains a model off their model it's not okay? Seems kind of hypocritical.

AndreasMoeller 1 month ago

Unless you own stock in Anthropic, this is a good thing right?

elzbardico 1 month ago

As an Open Source contributor who was never asked by Anthropic or OpenAI if they could use my work in their training datasets, this sounds so deliciously ironic.

guluarte 1 month ago

Anthropic trained their models on loads of copyrighted data, so?

tarruda 25 days ago

Hopefully this distillation will lead Alibaba to release more powerful open weights LLMs, contributing to the democratization of AI.

chriskanan 1 month ago

And all those reports of Claude when asked without a system prompt what its name was in Chinese it often would say Qwen or Deepseek, etc. I'd love Anthropic to say they aren't distilling and taking from every model out there, because I'm sure they are. As my mom would say, "the pot calling the kettle black." At least Alibaba and other Chinese companies are giving back to the AI community with detailed scientific papers on how their systems work and releasing open-weight or opensource models. I believe Anthropic has released nothing, and given that they had originally configured Fable to sabotage ML related work because only they can be trusted to do it safely, is just anti-science and anti-aligned with what I would consider good human values. They are way too sanctimonious and I don't trust them at all.

i2km 1 month ago

Couldn't anthropic just use fable to find security holes in Alibaba's systems and poison their models?

Or maybe there's been a bit too much hype...

senordevnyc 1 month ago

I wonder if some clever comedian here will make the very original joke that Anthropic is "getting a taste of their own medicine".

monegator 1 month ago

Soon, when even the enterprise subscriptions will have ads, every session will begin with a mandatory generated image:

> you would NEVER distill a model..

awkwabear 1 month ago

Wait so they're upset that people used their IP to train a model without their consent or paying them anything?

or is this just about the token reselling?

c0rruptbytes 1 month ago

if they’re paying for the tokens, what’s the problem

cmiles8 1 month ago

Funny how Anthropic doesn’t like when people just steal their stuff, with that stuff made using IP they (allegedly) stole from others.

theplumber 1 month ago

Let’s hope they distilled it properly so we can have the best of both worlds: a decent model to work with without Anthropic’s drama.

uberex 1 month ago

Hey, Alanis Morissette, this one is ironic.

watutalkinbout 1 month ago

An AI company stealing intellectual property?!

Oh, the inhumanity!

bparsons 1 month ago

Where did Anthropic get all their training data? Funny that these companies care about the sanctity of IP all of a sudden.

KennyBlanken 1 month ago

willy wonka oh-go-on-dot-gif

Gosh, overusing accounts running up unplanned-for expenses?

Kinda reminds me of...overusage charges and inflated expenses clients have had to deal with because Anthropic, OpenAI, Grok, etc have been "illicitly extracting" everything they can grab from said websites, as fast as they can. In what amounts to a DDOS, frankly.

krembo 1 month ago

Similar to improving an independent search engine by scraping Google search results and learning from it. Shady but legit

mattpetters 25 days ago

The great irony of crying IP theft as an LLM lab lol. Not so fun when you're on the receiving end, eh?

klustregrif 24 days ago

You are trying to kidnap what I have rightfully stolen, and I think it quite ungentlemanly!

retinaros 25 days ago

they even published papers where they distill gpt models…

https://alignment.anthropic.com/2025/subliminal-learning/

anthropic is really adversarial.

iFire 1 month ago

As far as I know, American copyright law has ruled large language model output has no copyright status.

exe34 1 month ago

Anthropic also trained on all human knowledge, so I'm okay with others distilling their models.

asasidh 1 month ago

People in glass houses shouldn't throw stones. Anthropic keeps throwing stones every few weeks.

seydor 1 month ago

It's not fair when others do it.

octocop 1 month ago

Isn't this done in the open? I saw the Qwopus model the other day. Basically same thing?

tonyoconnell 1 month ago

The narrative is moving towards KYC

nonethewiser 1 month ago

I'm all for it.

Grimblewald 1 month ago

Claude thinks it's ChatGPT, and sometimes various Chinese models. What's up with that?

hit8run 1 month ago

Thieves stealing from each other? No way. Guess what Anthropic trained their models on.

_3u10 1 month ago

The outputs belong to whoever purchased them. What are they complaining about?

jp0001 1 month ago

If they paid for the tokens, then is it really stealing or just learning?

anhtudev 1 month ago

People prefer Chinese models to US models. Looks like it is a counterattack.

digitaltrees 1 month ago

Call the wambulance: a company that stole all of humanity's public data to train a model is mad that someone used their model to train another model.

Give me a break. Every employee of anthropic is going to have $20m or more at the IPO.

I found out today that an employee of the home care agency I own is homeless. We are trying to figure out how to help her but it's shockingly common in the industry and there are limited resources to solve the reality of working homelessness.

bfjvibybd6cuvu6 1 month ago

To quote an infamous cop in the UK, I don't think you are mate.

xhinker2 1 month ago

If Anthropic’s accusation is substantiated — if using another model’s outputs to train your own model is considered “illicit extraction” — then everyone in the AI industry is guilty.

If you fine-tuned a model on GPT-4 outputs, you distilled GPT-4. If you used Claude to generate training data for your classifier, you distilled Claude. If you learned anything from any model’s outputs and used that to improve your own system or your own brain, you distilled it. The line between “learning” and “distilling” is non-existent. Intelligence is distillation. That’s literally how learning works — you expose yourself to high-quality outputs, internalize patterns, and generate your own.

If I use Anthropic’s model to learn and train my own brain, am I also distilling their model?

The accusation confuses learning with theft.

https://xhinker.medium.com/pot-calling-the-kettle-black-why-...

tagyro 1 month ago

And Anthropic illicitly used code I wrote to train their models.

psychoslave 1 month ago

In other news, a terrorist organization practicing torture on a daily basis just released a public denunciation of the evil forces they are fighting against, guided by their holy mission of making progress in social morality for all of us.

dev_l1x_be 1 month ago

Alibaba did research on Anthropic's capabilities? Interesting.

truthbe 1 month ago

How do I donate my logs

haritha-j 1 month ago

Oh gee, I've misplaced my world's smallest violin.

viktorcode 1 month ago

Sounds like an advertisement for the next model from Alibaba

Pxtl 1 month ago

"You're trying to kidnap what I've rightfully stolen!"

DrewADesign 1 month ago

“Hey! Haven’t you heard that two wrongs don’t make a right?!”
- Entitled jerk that initially wronged people
HarHarVeryFunny 1 month ago

You're teaching your parrot to say what our parrot is saying!

ForHackernews 1 month ago

I'll play the world's smallest violin for Dario

steve_woody 1 month ago

Let me join the concert

delduca 1 month ago

Internet says Anthropic illicitly extracted content

johnnyApplePRNG 1 month ago

No group is more paranoid than a den of thieves.

podgorniy 1 month ago

Something something about benefiting all humanity

budududuroiu 1 month ago

Has anyone else noticed that Deepseek v4 running in Claude Code will try to read, list, tail as many files/logs/... as it can for even the most simple tasks?

grayhatter 1 month ago

Oh no, someone is profiting off the work of others?!

anyways...

Madmallard 1 month ago

Sounds like fair game considering Claude is built upon the theft of creative assets of the entire world and aims to eliminate white collar jobs entirely

throawayonthe 1 month ago

i illicitly ate oatmeal this morning

pixel_popping 1 month ago

Illegal or just against their ToS?

meindnoch 1 month ago

What goes around, comes around.

sscaryterry 1 month ago

What goes around, comes around.

sarafiq 1 month ago

What goes around comes around!

aaa_aaa 1 month ago

Haha cry us a river Anthropic.

freejazz 1 month ago

Why would it not be fair use?

rayiner 1 month ago

This is why I don’t understand the concerns about “our AI overlords” monopolizing all the gains from AI. It doesn’t seem like there’s much of a moat around the models themselves. So the race is mainly about compute. But compute is subject to power law effects. I remember Intel building the first Teraflop computer (ASCI red) in 1996. It was the size of a house. By 2014 you had more compute and 50% more memory in an off the shelf dual processor server system.

InkCanon 25 days ago

The openness of AI is currently being held up only by Chinese companies (previously Meta, but they stopped). They're not saints, but there is not even a question in the open weight/HF community that the immense mass of Chinese talent, knowledge and resources is the only thing stopping a monopoly/duopoly from forming. In a very Cathedral vs Bazaar-esque way, Chinese labs severely lack compute, but are extremely ingenious in coming up with new optimizations, architectures, etc, which they all detail in their papers.

asadm 1 month ago

is there a good recipe or guide on doing a successful distillation these days?

gigioc 1 month ago

Please, honor among thieves!

4d4m 25 days ago

define illicit when your product is trained on copyrighted material?

fennecfoxy 1 month ago

I mean I believe in protecting your company's IP, but IP and patent law is absurd these days, designed to protect investors and their fake money rather than actual inventors (who usually get no proceeds/are shafted).

They trained from the internet, so if someone trains from them it's fair game. Their clever tech should be in the mechanism with which it uses to provide an answer, not the answer itself.

Groxx 1 month ago

Perhaps this is related to the "Mythos is too dangerous and cannot be exported" movements? It'd be a fairly effective way to justify extreme actions in combating it.

One could even wonder if they requested it, as a tactic to support their eventual IPO valuation.

Which is part of the problem of such an obviously-corrupt government: conspiracy theories are somewhat reasonable, as they keep getting validated.

quantum_state 1 month ago

Here it goes again ...

andai 1 month ago

We have Claude at home!

OtomotO 1 month ago

Karma truly is a bitch

demchaav 25 days ago

Why am I not surprised

casey2 24 days ago

Once again the "safety first®" lab caught with their pants down. Crying foul with little evidence. I'm glad LLMs aren't dangerous at all, since Anthropic has repeatedly demonstrated their inability to understand basic cyber security.

cindyllm 24 days ago

[dead]

drdrek 1 month ago

This is like a gardener complaining that you watch him as he works to learn his craft. My dude, you do not have to take the job, but most people just accept it as the way the world works. If they feel like they do not want to serve the Chinese they can do that on their own, why do they need the government?

leentee 1 month ago

What I get from this is that frontier model capabilities are stagnating.

serverlessmania 1 month ago

And Alibaba is releasing the full model weights open source under Apache 2.0, Anthropic… fuck that company.

hirako2000 1 month ago

Karma is a thing.

stego-tech 1 month ago

I'm sorry, but I can't stop laughing at an AI company crying about theft of their IP.

unnouinceput 1 month ago

Oh, c'mon. If Alibaba wanted, it can have the entire Claude/Mythos source code and data by next week. All you need is enough bribe to a developer that has access to the repository. Humans are always the weakest link in anything.

catigula 1 month ago

It's a bit disappointing when people see this as a schadenfreude moment because it's clearly not safe nor a good precedent for potentially dangerous AI models to trivially fall into the hands of malicious actors.

watwut 1 month ago

How dare they! Only we should be illicitly extracting everything others have done!

/Anthropic-probably

matheusmoreira 1 month ago

Please. These AI companies scraped everything under the sun. It's only fair that they get distilled into open weights models. Their own models should have been open weight from the start.

Ainaguade 1 month ago

"The distinction between downloading pirated copies vs. scanning physical books is fascinating — same data, different legal outcome. Copyright law really wasn't built for this era."

itvision 1 month ago

Seeking a monopoly on its business. And it's not just the Chinese, it's their US competitors as well.

Sorry, Anthropic, but AGI must belong to all of humanity, not just to you.

nicman23 1 month ago

fucking lol. it is always funny when companies use opensource and other free for non commercial use - and plain old piracy - and then cry about the same practices.

anonbuddy 1 month ago

the biggest irony of the 21st century

heyaco 25 days ago

they say 40% of the ai engineers are asian. why not just go home and build the next empire in china? there is a reason there is no asian in hollywood or silicon valley. they will just use you and never give you the spotlight. especially now with the rise of china. just wait until the propaganda starts. you will feel more loved and welcomed back home.

winddude 1 month ago

ooohhh nooo... anyway...

secretslol 1 month ago

Another day, another excuse as to why Fable 5 was pulled. Just waiting for Anthropic saying the Persona partnership was the fault of the Chinese.

bilsbie 1 month ago

Can we finally just nope out of this closed model of AI development?

It should all be open source with each gain shared and celebrated by all.

witx 1 month ago

F Anthropic in the back port

bwfan123 1 month ago

Model makers need to get off their high-horse, and face the reality that they are selling a commodity.

dainiusse 1 month ago

I am sorry, but companies doing the biggest IP theft in history have no moral right to complain here.

xela79 1 month ago

I would say Anthropic and others illicitly extracted free internet content and put it behind a paywall, giving zero compensation to those that made their whole business possible in the first place. So smallest violin player busy here trying to make me care if it happens to them.

PunchyHamster 1 month ago

Being absolute ass to entire internet as you scour everything with no regard to common protocols - fine

Getting treated exactly same by competition - "we need rapid, coordinated action among industry players, policymakers and the global AI community."

Absolute scum. And the gall of going "oh buh it can be used for military, quick govt do something".

nacozarina 1 month ago

Thieves complaining about theft and then gaslighting the victims; rich, but not smooth.

snickerbockers 1 month ago

So NOW the hypocrites are demanding permission to train a model on THEIR data.

mityda 1 month ago

I like some of the Alibaba products, wtf do they know about AI models???

dolebirchwood 1 month ago

Good. I'm glad. Keep it up, China. Loving my cheap GLM and DeepSeek.

mityda 1 month ago

I like some of the Alibaba products, wtf they know about AI models???

youknownothing 1 month ago

laughs in ironic

Freedumbs 1 month ago

This article is absurd for an outlet who published an article that's meant to be news not editorial. Reuters was once a news wire and is still considered that. The first two paragraphs refer to "attack" and "strike" against Anthropic. This is sensational nonsense, not news. There was no strike, or attack. Block the accounts. Why is this news, and why are they pandering to the people who just banned the new model they burned at least $10 billion training? The closer you look at this AI stuff the more absurd it is. I assume the strat is to keep the bubble floating until post-2028, then drop the bomb on the Dem who wins. Just like with the covid inflation + economic rigging Trump did in 2016-2020.

cakeface 1 month ago

More distillation please. This is only good for me.

bridgettegraham 1 month ago

lol. good for the chinese. I hope their models get better than the closed american ones quick so we can stop using "controlled" models.

lt-runtime 1 month ago

eh.. Anthropic wants open-weight models gone.

johnwheeler 1 month ago

Well, of course they did. Are you kidding?

randomfrogs 1 month ago

"They stole our stolen data!"

1a527dd5 1 month ago

"Hypocrisy, thy name is you"

johnnyevert 1 month ago

The pot calling the kettle black.

rogermungo 1 month ago

The Pot calling the Kettle black

vips7L 1 month ago

Booohooo the people who stole everything they have want to cry about having what they produced stolen???

8note 1 month ago

so what? anthropic stole this functionality from everyone else

JasonHEIN 1 month ago

we now know what to use when Fable is too dangerous!

impartshadow 25 days ago

[flagged]

cws_ai_buddy 1 month ago

[flagged]

nsoonhui 1 month ago

[flagged]

yashthakker 1 month ago

[flagged]

Anoian 1 month ago

[dead]

ElenaDaibunny 1 month ago

[dead]

gdst218 17 days ago

[dead]

uymbybumby 1 month ago

[dead]

ProjectVader 1 month ago

[flagged]

z0ltan 1 month ago

[dead]

animanoir 1 month ago

[dead]

hereme888 1 month ago

[flagged]

Mr_Xpes 1 month ago

[flagged]

rochak 1 month ago

Ok boomer

scotty79 1 month ago

Nobody cares. Grow up.

soundworlds 1 month ago

How the hell does Anthropic continue to make such hypocritical complaints without deeply cringing?

It's becoming embarrassing to watch

emsign 1 month ago

And that's coming from the intellectual property thieves. Laughable. Let the Chinese steal the models, they will only make it cheaper for everyone.

rsynnott 1 month ago

Oh, _now_ we care about IP, do we?

delta_p_delta_x 1 month ago

Cue Jeremy Clarkson's 'Oh no! Anyway...' GIF.

PostOnce 1 month ago

Suppose Anthropic trained only on data they paid to create, and not the internet or stolen textbooks.

It would still be extremely difficult to muster any sympathy for an organization whose MO is to go public not to honestly raise capital to fund growth and development, but rather to dishonestly leave someone else holding the bag, in some cases involuntarily as their retirement funds are passively invested.

And even supposing they were honest and didn't have an IPO, it would still be extraordinarily difficult to care about their misfortune, because "consolidating all thought-work into the hands of those few who can afford frontier models and datacenters and power plants" is also a special kind of misanthropy.

And even if that were not the case, they're filthy rich already, so who gives a shit if the Chinese companies prevent them from becoming quadrillionaires? :)

phplovesong 1 month ago

Why are they mad about this? It's not like they didn't commit the biggest IP theft in modern history when training their models.

nullc 1 month ago

Anthropic extracted millions of words of my own writing even more illicitly for they did not do so through an API provided for that purpose while paying me in the process.

irthomasthomas 1 month ago

Ask Claude its name in Chinese and it thinks it's Qwen (opus) or Deepseek (sonnet). Anthropic are just as guilty as everyone else training AI today, maybe more so. Every lab borrows from every other. It only takes a few hundred samples to figure out the pattern; look at glm-5.2 reasoning using the caveman tongue of gpt-5.5. Stopping this would require some draconian surveillance.

kgeist 1 month ago
That's not how it works though. When you prepare the conversations for distillation, it's the most trivial and obvious first step to replace "Qwen" with "Claude" and vice versa. I doubt they'd simply forget to do it.
A model may misidentify itself due to the surrounding context. When a model is about to answer "I'm ...", what follows is a sorted list of probabilities for what the next token should be. In most models it's usually a list of popular model names: say, in the list, first comes Claude, then Qwen, then ChatGPT etc. Usually the "Claude" token would be the most probable token, say 70%. But if the surrounding context is in Chinese, the embeddings for "something to do with China" may nudge the combined embedding of the output token towards the "Qwen" embedding more ("China+Claude=Qwen" in the embedding space). Say, the probability for "Qwen" now becomes 60% instead of 10%.
If we also use high temperature for more "creativity", the token sampler now may choose "Qwen". It's not the most probable token still, but it was chosen because selecting the 2nd most probable token once in a while usually allows a model to explore unexpected "creative" paths, and 60% probability is good enough compared to 70%. It's basically a hallucination.
I once ran an experiment: if I ban the word "Qwen" in the inference engine entirely, and ask Qwen "which model are you?", it happily starts announcing it's Claude 100% of the time, simply because "Claude" is the next most probable token after "Qwen" in this context.
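A rough sketch of the sampling argument above, in plain Python. Everything here is made up for illustration: the candidate names, the scores, and the size of the "Chinese context" nudge are assumptions, and `next_token_probs` / `sample_token` are hypothetical helpers, not any inference engine's API. It only shows the mechanism: a context shift raises the probability of "Qwen" without making it the top token, and banning the top token hands the answer to the runner-up.

```python
import math
import random

def next_token_probs(logits, temperature=1.0, banned=()):
    """Softmax over the allowed tokens; higher temperature flattens the distribution."""
    allowed = {tok: score / temperature for tok, score in logits.items() if tok not in banned}
    peak = max(allowed.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in allowed.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def sample_token(logits, temperature=1.0, banned=(), rng=random):
    """Draw one token at random, weighted by its probability."""
    probs = next_token_probs(logits, temperature, banned)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

# Made-up scores for the token after "I'm ..." in a Claude-like model.
english_logits = {"Claude": 3.0, "Qwen": 1.0, "ChatGPT": 0.5}
# Assumed effect of a Chinese-language context: the "Qwen" score is nudged up.
chinese_logits = {**english_logits, "Qwen": 2.8}

p_en = next_token_probs(english_logits)
p_zh = next_token_probs(chinese_logits)
print("English context:", p_en)
print("Chinese context:", p_zh)

# The banned-token experiment, on made-up scores for a Qwen-like model:
# with "Qwen" removed, the runner-up takes nearly all the probability mass.
qwen_logits = {"Qwen": 5.0, "Claude": 2.0, "ChatGPT": 1.0}
p_banned = next_token_probs(qwen_logits, temperature=0.1, banned={"Qwen"})
print("Qwen banned:", p_banned)

# One random draw at a higher temperature; the result varies with the seed.
print(sample_token(chinese_logits, temperature=1.5, rng=random.Random(0)))
```

Calling `sample_token` in a loop and counting the results gives the frequency of each name at a given temperature, which is the quantity to compare against observed behaviour.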
- irthomasthomas 25 days ago
  
  > If we also use high temperature for more "creativity", the token sampler now may choose "Qwen".
  If that was the cause then, like you said, it would sometimes pick Claude. But it doesn't, it consistently picks Deepseek (sonnet) and Qwen (opus). You can run it 100 times and see this behaviour much more than high temperature randomness would predict.