Comment by trymas

13 hours ago

> Anthropic's IP was created by harvesting and "distilling" other people's IP. Copyrighted materials, and the commons... which they have essentially privatized.

Anthropic and others argue that because LLMs don’t output full copyrighted works word for word - hence their LLMs aren’t infringing on copyright laws.

I think (if it ever comes to that) Chinese labs should use the same arguments against Anthropic.

UPDATE: this is slight hyperbole, of course; it's not worth arguing over what they actually said. The point is the intent and the facts: "The Big LLMs" "distilled" collective knowledge, including copyrighted works, at unimaginable scale, yet it's all kosher and totally not piracy or copyright infringement. But if you're a teenager torrenting an MP3, you'll get screwed.

5 comments

nutjob2 13 hours ago

> LLMs don’t output full copyrighted works word for word

Apparently they do, as per the evidence in the NYT vs OpenAI suit.

Hamuko 13 hours ago

Isn’t the output of LLMs completely copyright-free in the US?

nutjob2 12 hours ago

One lower court has said that the output of AI models is uncopyrightable. But the real unsettled issue is whether model training is fair use, and where copyright infringement might creep into model output.

camgunz 12 hours ago

The Copyright Office itself also says this when it talks about determining authorship.

gspr 12 hours ago

> Anthropic and others argue that because LLMs don’t output full copyrighted works word for word - hence their LLMs aren’t infringing on copyright laws.

That surely can't be what they argue, because I'm sure I can't translate a copyrighted book into a different language and say "that's fine, it's not word-for-word".