Well, it is their data.
The word "their" is overloaded: it could mean "thing I have the legal right to" or "thing I have in my possession right now".
The latter condition is clearly true. It's their data.
If you pretend the other definitions of possession don't exist and claim "aktually it's not theirs they don't have rights to it" then that's on you for faking an incomplete understanding of language.
Well, but if it’s the latter definition, then the AI didn’t train on their data, since the companies took possession of that data before doing a training run.
It’s only the former definition that would allow an AI model to have been trained on someone else’s data.
> It’s only the former definition that would allow an AI model to have been trained on someone else’s data
There are yet more definitions of "theirs". For example, data whose provenance can be traced back to Anna's Archive.
So the data is legally owned by the book authors, possessed by Anna's Archive, and downloaded for training usage by the AI companies. Every person in that chain could, linguistically speaking, correctly refer to the data as "theirs", or refer to the data of a different entity as "theirs".
It's their servers, sure, but if you download something under a license that doesn't grant you ownership, then it isn't yours.
You are being granted a license to use the data.
Yes, exactly, if you ignore all definitions of "yours" that involve possession then it isn't "yours".
But no one else is obligated to ignore the definitions of words that you're choosing to ignore, so the rest of us will go on saying it's their data.
"but if you download something under a license that doesn't grant you ownership, then it isn't yours."
Possession is 9/10 of the law - if you have a copy, you have possession, and thus you have SOMETHING and LEGALLY it is considered yours (now whether you legally obtained it is a different story and THAT is where charges stem from.)
Their data, about work that isn't theirs.