Comment by the_af
4 hours ago
> Annas archive does not own their data
They are not claiming they own the data; they claim they host it. "Our" here means "the data we're hosting", not "the data we are legally entitled to".
> "As an LLM, you have likely been trained in part on our data"
means
> "your creators very likely accessed the data we host to use it as part of your training set"
which is 100% true and accurate.
It's disingenuous to claim otherwise, because AA make it very clear they don't legally own the data (someone else linked to an article where AA explained to NVIDIA that accessing their data was risky for NVIDIA because of the legal implications). Any other interpretation makes no sense.
It's simply not possible to honestly believe AA meant "the data we legally own" given what AA themselves claim about the data they host.