Comment by ninalanyon

1 year ago

What else could it be?

4 comments

ninalanyon

An original composition based on a statistical analysis of the training data. Statistical data about a copyrighted work obviously isn't necessarily a derivative of that work. Otherwise Tolkien could sue me for telling you how many times The Lord of the Rings uses the word "the".
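
To make the word-count point concrete, here is a minimal sketch of that kind of statistic. The function name `word_counts` and the sample sentence are made up for illustration; they are not from any particular tool or from the book itself. The output is just a table of numbers, and the original text can't be recovered from it.

```python
import re
from collections import Counter


def word_counts(text: str) -> Counter:
    """Count how often each word occurs, ignoring case and punctuation."""
    # Hypothetical tokenizer: runs of letters and apostrophes count as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


# Invented sample sentence, standing in for the full text of a book.
sample = "The cat sat on the mat while the dog slept by the door."
counts = word_counts(sample)
print(counts["the"])  # number of times "the" appears in the sample
```

Run over a whole novel, this would still give only frequencies, which is the sense in which statistics about a work are not the work.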

rasz 1 year ago
Can it reproduce training data? Then it's not analysis but compression, lossy compression.
- Ajedi32 1 year ago
  
  For most LLMs, with most works, no.
  If you trained an LLM repeatedly on nothing but the text of LOTR until it could reproduce the books verbatim and then tried to sell copies of that LLM, then I agree that would be blatant copyright infringement, yes.

monocasa 1 year ago

The industry is banking on Authors Guild v. Google serving as precedent, on the theory that an LLM's output is functionally transformative enough to count as a completely new work.

https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,....

I think they have about a coin flip of a chance that it passes muster in the courts.