Comment by pulkitsh1234

9 months ago

I am curious: where did the other companies (OpenAI, Anthropic, et al.) get their training data from? Why is only Meta under fire for this?

It's not just books; most websites technically don't allow scraping of their content, yet most of the content these models were trained on was scraped from the web. The legality of that scraping is still an open question.