Comment by stingraycharles

5 hours ago

> There is no reason to do this, because there are far better ways to exfiltrate data from Anthropic models. Chinese companies are already doing this at an industrial scale where they are reselling Claude tokens for 10-20% of the cost while retaining the data to train their own models.

I think you missed the part where Anthropic stopped displaying their thinking tokens over the past few months, and instead now provides “summarized thinking”, letting Haiku summarize Opus’ thoughts.

So it is now much more difficult (impossible?) to distill the models.

I also think you overestimate how well the legal system works in the US nowadays, and underestimate just how much power Elon has in the government.

At this scale thinking tokens don't matter anymore.

In Feb, Anthropic called out three Chinese labs for "distillation attacks", but a lab missing from their post actually had the most Claude-generated tokens of any Chinese lab in its midtrain data :p

Incidentally, Claude started the "Summarized Thinking" bullshit around a year ago.

DeepSeek kept up its pace of improvement nonetheless.