Comment by IlikeKitties
4 days ago
In fact, Facebook torrented Anna's Archive and got busted for it, because of course they did:
https://torrentfreak.com/meta-torrented-over-81-tb-of-data-t...
Every LLM maker probably did the same. Facebook just has disgruntled employees who leaked it.
Google goes around legally scanning every book they can get their hands on with books.google.com, and legally scanning every paper they can get their hands on with scholar.google.com.
I doubt they'd resort to piracy for what is basically the same information as what they've already legally acquired...
That is a good reason to think they did not, but it doesn't necessarily override the reasons they might have had to do so. Perhaps it's dubious that the subset of data they could not legally get their hands on is an advantage for training, but I really don't know, and maybe nobody does. Given that, Google's execs may have been in favor of operations similar to Facebook's, and their lawyers may have been willing to approve them with similar justifications.
Downloading a torrent isn't piracy if you are a license holder for the information that you are downloading.