
Comment by lysace

1 day ago

At some point they must become more cost-efficient through pure market economics. That implies less load on sites. Much of the scraping I see is still very dumb/repetitive. Like Googlebot in 2001.

(Blocking Chinese IP ranges with the help of some geoip db helps a lot in the short term. Azure as a whole is the second largest source of pure idiocy.)
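For the app side the check itself is only a few lines. A rough sketch, assuming the MaxMind GeoLite2 country database and the geoip2 Python package; the DB path and blocked-country list are just placeholders:

    # Rough sketch: per-request country blocking against a local MaxMind
    # GeoLite2-Country database via the geoip2 package. The path and the
    # blocked country set are placeholders for illustration.
    import geoip2.database
    import geoip2.errors

    BLOCKED_COUNTRIES = {"CN"}  # example only
    reader = geoip2.database.Reader("/var/lib/geoip/GeoLite2-Country.mmdb")

    def is_blocked(client_ip: str) -> bool:
        """Return True if the client IP resolves to a blocked country."""
        try:
            iso = reader.country(client_ip).country.iso_code
        except geoip2.errors.AddressNotFoundError:
            return False  # unknown/unroutable IPs pass through
        return iso in BLOCKED_COUNTRIES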

They seem to have so much bubble money at the moment that the cost of scraping is probably a rounding error in their pocket change.

  • So the cost of caching should be a rounding error as well. If The Internet Archive can afford to cache vast swathes of the web, then surely the big AI companies can do so.
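    A conditional-request cache on the crawler side is not much code either. Rough sketch only, using the Python requests library; the in-memory dict stands in for whatever store a real crawler would use:

        # Sketch: re-crawl a URL but only transfer the body if it changed,
        # using ETag / Last-Modified conditional requests.
        import requests

        cache = {}  # url -> {"etag": ..., "last_modified": ..., "body": ...}

        def fetch_cached(url: str) -> bytes:
            headers = {}
            entry = cache.get(url)
            if entry:
                if entry.get("etag"):
                    headers["If-None-Match"] = entry["etag"]
                if entry.get("last_modified"):
                    headers["If-Modified-Since"] = entry["last_modified"]
            resp = requests.get(url, headers=headers, timeout=30)
            if resp.status_code == 304 and entry:
                return entry["body"]  # unchanged: reuse the cached copy
            cache[url] = {
                "etag": resp.headers.get("ETag"),
                "last_modified": resp.headers.get("Last-Modified"),
                "body": resp.content,
            }
            return resp.content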