Comment by virgildotcodes
16 hours ago
Here's a probably stupid question - if someone were unbounded by ethics and conceivably had enough power and connections to power to shield themselves from many consequences of their actions - and that person owned these DCs, could they in theory observe all the streams of tokens coming in and out of these models, and even exfiltrate copies of these models wholesale to have their own teams do what they will with them in the pursuit of building their own competitive models?
Or is there something fundamental in the way these models get deployed at this scale (encryption or something other than legal contracts?) that prohibits the owners of the infra from gaining this level of insight / access?
1) The situation you described would be covered under the contract between Anthropic and xAI, and any violation of it would be subject to financial penalties and legal proceedings. The US has a robust corporate legal system, and disputes do get resolved through the courts, albeit slowly and at significant cost.
The contract can stipulate a penalty at a high enough amount to discourage this behavior.
2) Output from models & intra-datacenter communications can be encrypted if customers truly care.
3) There is no reason to do this, because there are far better ways to exfiltrate data from Anthropic models. Chinese companies are already doing this at an industrial scale: they resell Claude tokens for 10-20% of the cost while retaining the data to train their own models. https://www.chinatalk.media/p/how-to-buy-cheap-claude-tokens...
If we look at Deepseek V4-pro, created by Deepseek, which Anthropic formally accused of harvesting Claude tokens at scale, it performs the same as Claude did 6 months prior.
> The US has a robust corporate legal system
Thanks for the chuckle. ;)
It's an accurate statement. The US (specifically Delaware) has a world-class corporate law system. Delaware has a court dedicated to only corporate and commercial disputes, with 200+ years of case law.
It does, though.
> There is no reason to do this, because there are far better ways to exfiltrate data from Anthropic models. Chinese companies are already doing this at an industrial scale where they are reselling Claude tokens for 10-20% of the cost while retaining the data to train their own models.
I think you missed the part where Anthropic stopped displaying their thinking tokens over the past few months, and instead now provides “summarized thinking”, letting Haiku summarize Opus’ thoughts.
So it is now much more difficult (impossible?) to distill the models.
I also think you over-estimate how well the legal system works in the US nowadays, and under-estimate just how much power Elon has in the government.
At this scale thinking tokens don't matter anymore.
In Feb Anthropic called out three Chinese labs for "distillation attacks", but a lab missing from their post actually had the most Claude-generated tokens in its midtrain data among all Chinese labs :p
Incidentally, Claude started the "Summarized Thinking" bullshit around a year ago.
Deepseek kept its pace of improvement nonetheless.
I guess there's a reason why those Chinese companies are in China.
I hope the Chinese keep harvesting Claude etc. since they stole all their data anyway, who cares?
It's already in the public domain (thanks to the OpenAI trial) that Grok distilled OpenAI's models. Listening to the data going into the models in the data centre would be a very similar thing. There are some downsides (you're passively listening, not controlling the queries), and some upsides (way more data). But it only ever gets you to some percentage of the existing production model. It doesn't get you what Musk wants: an AI company capable of designing and deploying leading-edge models. It gets you to fast-follower status.
You could, but Grok is pretty high up there. It might not be "#1", but it's definitely up there with the giants; people seem to overlook it. Gemini has a similar problem. It was #1 once, and it seems like Google isn't hell-bent on chasing #1. They just want to keep iterating over time; they know it only needs to be "good enough" and they'll keep having repeat customers.
If Elon REALLY wanted to do anything like that he would be better off poaching talent from competitors, less legal hell to go through.
No, Grok is not 'up there'. Not by any stretch of the imagination is it anywhere close to Anthropic or OpenAI in any single domain whatsoever. Not even close to Deepseek. It really isn't a good model. Their research team's talent is just not good, unfortunately.
If I was training a top-tier model, having a competitor's weights feels like it would be an excellent refining tool. As a user I find cross-model iteration to be a huge power-tool, and presumably the boffins can zero in on areas of relative strength and weakness and work out which areas they want massive amounts of synthetic data from.
Bear in mind these data centers were built for xAI to use themselves, so there would have been no reason to backdoor them. Also, this is all off-the-shelf equipment: Dell & Supermicro servers with NVIDIA compute modules and Ethernet switches.
xAI had a lot of negotiating power here because Anthropic had ~0 comparable options and ultimately desperately needed the compute now. So it wouldn't surprise me if data sharing was an explicit part of the agreement.
What prevents a data center operator from reading your chats? [FEATURE] Provide a way to select your data center #56916 https://github.com/anthropics/claude-code/issues/56916
Yes he could do that and I thought about this too. Very weird tbh
There are accusations that the Chinese labs have done essentially that to OpenAI and Anthropic and exfiltrated their models without having DC access, so if you had DC access, yes, you could do that. If you had DC access, though, you could just copy the model onto an SSD.
The Chinese labs have been accused of training on elicited chat logs on a massive scale in violation of ToS. That's possibly a real concern, but it's nowhere close to "exfiltrating" the model or even roughly matching its behavior.