Comment by supriyo-biswas

1 year ago

Google does get a pass: they scrape content with Googlebot, but then voluntarily check robots.txt for a "Google-Extended" rule to decide whether they can use that content for LLM training[1].
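
To make the split concrete, here is a rough sketch of a site that stays in search but opts out of training use. The robots.txt contents, the example.com URL and the `allowed` helper are made up for illustration, and Python's `urllib.robotparser` only approximates how Google itself evaluates these rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow search crawling, opt out of LLM training use.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Google-Extended
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent for url (illustrative helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder URL, not a real site policy.
search_ok = allowed("Googlebot", "https://example.com/post")
training_ok = allowed("Google-Extended", "https://example.com/post")
print(search_ok, training_ok)
```

With rules like these the page can still be crawled and indexed, while the "Google-Extended" group signals that it should not be used for training.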

I assume Microsoft intends to do the same, given they have Bing and their recent stance on the matter[2].

[1] https://developers.google.com/search/docs/crawling-indexing/...

[2] https://www.businesstoday.in/technology/news/story/microsoft...