Comment by lysace
25 days ago
At some point they must become more cost efficient through pure market economics. That implies less load on sites. Much of the scraping that I see is still very dumb/repetitive. Like Googlebot circa 2001.
(Blocking Chinese IP ranges with the help of a geoip db helps a lot in the short term. Azure as a whole is the second-largest source of pure idiocy.)
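For what it's worth, a minimal sketch of that kind of country-level filter, assuming the MaxMind GeoLite2-Country database and the Python geoip2 package; the file path and blocklist here are just illustrative:

    # Country-level IP filtering against a local GeoIP database.
    # Assumes GeoLite2-Country.mmdb on disk and `pip install geoip2`.
    import geoip2.database
    import geoip2.errors

    BLOCKED_COUNTRIES = {"CN"}  # ISO country codes to drop

    reader = geoip2.database.Reader("/var/lib/geoip/GeoLite2-Country.mmdb")

    def should_block(client_ip: str) -> bool:
        """Return True if the client IP resolves to a blocked country."""
        try:
            iso = reader.country(client_ip).country.iso_code
        except geoip2.errors.AddressNotFoundError:
            return False  # unknown/unlisted ranges pass through
        return iso in BLOCKED_COUNTRIES

In practice you would do this at the edge (firewall ipset or reverse-proxy geoip module) rather than in application code, but the lookup logic is the same.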
They seem to have so much bubble money at the moment that the cost of scraping is probably a rounding error in their pocket change.
So the cost of caching should be a rounding error as well. If the Internet Archive can afford to cache vast swathes of the web, then surely the big AI companies can do so.
Exactly.