Comment by hattmall

16 days ago

Perhaps it wasn't a widespread norm though. But I don't really see why that matters as much: is the issue that sites with robots.txt today only allow Googlebot and not other search engines? Or is Google somehow benefitting from having two-decade-old content that is now blocked because of robots.txt that the website operators don't want indexed?

Agree. It was not standard in the late 90s or early 00s. Most sites were custom built and relied on the _webmaster_ knowing and understanding how robots.txt worked. I'd heard plenty of examples where people inadvertently blocked crawlers from their site because they got the syntax wrong. CMSes probably helped drive widespread adoption, e.g. WordPress.
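
A classic version of that syntax mistake (a hypothetical sketch, not an example from the comment above) is confusing an empty `Disallow` with `Disallow: /`:

```
# Intended: "allow everyone". Actual effect: blocks the ENTIRE site,
# because "Disallow: /" matches every path.
User-agent: *
Disallow: /

# What the webmaster meant: an empty Disallow value allows everything.
User-agent: *
Disallow:
```

One character of difference, and a crawler-friendly site silently drops out of every search index.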