Comment by gpm
4 days ago
Google goes around legally scanning every book they can get their hands on with books.google.com. Legally scanning every paper they can get their hands on with scholar.google.com.
I doubt they'd resort to piracy for what is basically the same information as what they've already legally acquired...
That is a good reason to think they did not, but it doesn't necessarily override the reasons they might have had to do so. Perhaps it's dubious that the subset of data they could not legally get their hands on is an advantage for training, but I really don't know, and maybe nobody does. Given that, Google's execs may have been in favor of operations similar to Facebook's, and their lawyers may have been willing to approve them with similar justifications.
Downloading a torrent isn't piracy if you are a license holder for the information that you are downloading.
*If the license you have authorizes you to make a copy in that fashion.
But here, Google isn't a license holder. Google doesn't license the text in Google Books (unless something has changed since the lawsuits). Google simply acquires a copy of the book legally (buys, borrows, etc.) and does things with it that the US courts have found are fair use and require no license.
Incidentally, I believe the French courts disagreed, fined them half a million dollars or so, and ordered them to stop in France.