
Comment by tpmoney

1 day ago

And distributing an AI model trained on that text is neither distributing the work nor a modification of the work, so the GPL (or other) license terms don't apply. As it stands, the courts have found training an AI model to be a sufficiently transformative action and fair use, which means the resulting output of that training is not a "copy" under copyright law.

> And distributing an AI model trained on that text is neither distributing the work nor a modification of the work, so the GPL (or other) license terms don't apply.

If I print a Harry Potter book in red ink, then I won't have any copyright issues?

I don't think changing how the information is stored removes copyright.

  • If it is sufficiently transformative, yes, it does. That’s why “information” per se is not eligible for copyright, no matter what the NFL wants you to think. No, printing the entire text of a Harry Potter book in red ink is not likely to be viewed as sufficiently transformative. But if you take the entirety of that book and publish a list of every word and its frequency, it’s extremely unlikely to be found a violation of copyright. If you publish a count of every word with the frequency weighted by what word came before it, you’re also very unlikely to be found to have violated copyright. If you distribute the MD5 sum of the file that is a Harry Potter book, you’re also not likely to be found to have violated copyright. All of these are “changing how the information is stored”.
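For anyone who wants to see what those three transformations look like in practice, here is a rough Python sketch. The function names and the sample sentence are made up for illustration, and splitting on whitespace is a deliberate simplification of real tokenization. It only shows how the data changes shape; it says nothing about how a court would treat any of it.

```python
import hashlib
from collections import Counter, defaultdict


def word_frequencies(text):
    """Count how often each whitespace-separated word appears."""
    return Counter(text.lower().split())


def bigram_frequencies(text):
    """For each word, count which words follow it and how often."""
    words = text.lower().split()
    table = defaultdict(Counter)
    for previous, current in zip(words, words[1:]):
        table[previous][current] += 1
    return table


def md5_of(text):
    """Return the MD5 hex digest of the UTF-8 encoded text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Placeholder text standing in for a whole book.
sample = "the cat sat on the mat and the cat slept"

if __name__ == "__main__":
    print(word_frequencies(sample).most_common(3))
    print(dict(bigram_frequencies(sample)["the"]))
    print(md5_of(sample))
```

Note that the word list and the digest throw away word order entirely, and the bigram table keeps only which word follows which, so none of the three outputs contains the original text as such.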