Comment by netcan
13 hours ago
Hypocrisy is a form of corruption.
Anthropic's IP was created by harvesting and "distilling" other people's IP. Copyrighted materials, and the commons... which they have essentially privatized.
The commercial goal is to avoid competition. One of the main worries for AI is "commoditization," which has come to mean "not a monopoly." To that end, it doesn't matter if the competitor is Chinese, American, or other.
Their motivation here is clearly protectionism. The argument they make to politicians is national security. The legal argument is IP-theft, violation of service agreements or whatnot.
This is all very dangerous. Commercial interests repackaged as national security can lead to armed conflict.
> Anthropic's IP was created by harvesting and "distilling" other people's IP. Copyrighted materials, and the commons... which they have essentially privatized.
Anthropic and others argue that because LLMs don’t output full copyrighted works word for word, their LLMs aren’t infringing on copyright laws.
I think (if it ever comes to that) Chinese labs should use the same arguments against Anthropic.
UPDATE: this is a slight hyperbole of course, not worth arguing what they actually said. The point is intent and the facts: "The Big LLMs" "distilled" collective knowledge, including copyrighted works, at unimaginable scale, but it's all kosher and totally not piracy/copyright infringement. Though if you're a teenager torrenting an mp3, you'll get screwed.
> LLMs don’t output full copyrighted works word for word
Apparently they do, as per the evidence in the NYT vs OpenAI suit.
Isn’t the output of LLMs completely copyright-free in the US?
One lower court has said that the output of AI models is uncopyrightable.
But the real unsettled issue is whether model training is fair use, and where copyright infringement might creep into model output.
> Anthropic and others argue that because LLMs don’t output full copyrighted works word for word, their LLMs aren’t infringing on copyright laws.
That surely can't be what they argue, because I'm sure I can't translate a copyrighted book into a different language and say "that's fine, it's not word-for-word".
Bad China is stealing our stolen IP!
And putting it into free models like Qwen. It’s hard to care about this.
"Copyright violation of a published work" and "stealing private trade secrets" are in fact very different crimes.
Humans have spent millennia harvesting and distilling each other's IP - "the shoulders of giants" and all that, so it's an especially disingenuous take.
For something to be a trade secret, you have to actually keep it secret. If I get the ingredients of Coca-Cola from an ex-employee, I've stolen a trade secret. If I work it out by doing a chemical analysis, I've stolen nothing.
There is a difference with Anthropic, as no one signs a licence agreement to buy a Coke. But Anthropic are also not saying you can't publish the output of their models. It's not clear to me if trade secret law will (or should) cover a secret which can be extracted from information that licensees are not restricted from publishing.
Wait, really? So why doesn't someone just reverse-engineer Coca-Cola like that? My understanding was that a "clean room" implementation is fine, but not reverse-engineering. If you can just copy everything on the market, why isn't someone already doing that?
> Humans have spent millennia harvesting and distilling each other's IP
You may be somewhat correct, but copyright lawyers wouldn’t have work if others' IP were up for grabs willy-nilly just because “shoulders of giants and all that”.
I mean, there's an obvious difference between "distributing copies" (which is what the law was designed to prevent) and "training an LLM". We already managed "banning LLM output that contains copyrighted text" - it's much easier to just pirate a copy of the text. So I think the copyright lawyers will continue to have work as long as human-written texts are worth buying.