Comment by namenotrequired
12 hours ago
> There are plenty of old books in the public domain already
Yes, but showing that it happens with books in the public domain does nothing to prove that it happens with copyrighted books.
"Same difference," as the saying goes. If their claims are true then you can make the model recite "lorem ipsum" or anything else that's long and has nonzero entropy.
It’s not the same. Presumably public domain works are shared much more frequently on the public internet, and are therefore much more common in the training set.
The difference is that one of them is completely fine, and the other is a crime.