Comment by plaidfuji

6 hours ago

It’s the exact same mental gymnastics that cause people to accuse model providers of large-scale plagiarism.

That is to say, not that much gymnastics. Like a cartwheel at most.

2 comments

plaidfuji

I don't really agree with those guys either.

The reason is fairly straightforward: there's no alternative if you need the dataset.

And use in LLMs is transformative, so it would fall under fair use. The only reason they're in trouble with the courts at the moment, from my understanding, is that they pirated the content instead of, idk, ripping it from Libby.

MrDOS 5 hours ago

Anna's Archive aren't filing the serial numbers off the epubs they redistribute. Rightfully or wrongly distributed, the attribution is crystal clear.