Comment by sugarpimpdorsey

3 days ago

A lot of these passive anti-abuse systems rest on the rather bold assumption that making a bot perform a computation is expensive for the bot but not for me as an ordinary user.

According to whom or what data exactly?

AI operators are clearly well-funded, and the cost in electricity and CPU time is negligible to them. Software like Anubis and nearly all of its largely identical predecessors grants you access after a single "proof", so you then have free rein to scrape the whole site.
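
Rough back-of-the-envelope sketch of that amortization point; all of the numbers here (solve time, CPU draw, electricity price, page count) are made-up assumptions for illustration, not measurements of Anubis or any other tool:

    # Back-of-the-envelope: one-time PoW cost amortized over a scrape.
    # All numbers are illustrative assumptions, not measurements.

    solve_seconds = 1.0          # assumed time to solve one challenge
    cpu_watts = 15.0             # assumed CPU draw while solving
    kwh_price = 0.15             # assumed $/kWh

    pages_scraped = 100_000      # pages fetched with the single token

    energy_kwh = cpu_watts * solve_seconds / 3600 / 1000
    cost_total = energy_kwh * kwh_price
    cost_per_page = cost_total / pages_scraped

    print(f"one solve: ~${cost_total:.10f} in electricity")
    print(f"amortized: ~${cost_per_page:.2e} per page scraped")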

The best physical analogy is those coin-deposit shopping carts where you insert a quarter to unlock the cart, and you presumably get it back when you return the cart.

The group of people this doesn't affect is the well-funded: a quarter is a small price to pay for leaving your cart in the middle of the parking lot.

The people who suffer most are the ones who can't find a quarter in the cupholder and end up filling their arms with groceries.

Would you be richer if they didn't charge you a quarter? (With these anti-bot tools you're paying the electric company, not the site owner.) Maybe. But if you're Scrooge McDuck, who's counting?

Right, that's the point of the article. If you can tune costs asymmetrically for bots/scrapers, it doesn't matter: you can drive bot costs to infinity without doing the same to users. But if everyone's on a level playing field, PoW is problematic.
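
As a toy illustration of what "tunable asymmetry" could look like, the sketch below scales challenge difficulty with how bot-like a client's request rate is. The suspicion signal, thresholds, and doubling rule are all invented for illustration; this is not how Anubis or any particular tool is configured.

    # Toy illustration of asymmetric challenge difficulty: scale the PoW
    # target by how bot-like a client looks. A human browsing at a normal
    # rate stays at the cheap baseline; a client hammering the site gets
    # exponentially more work. All thresholds are made up.

    BASE_DIFFICULTY_BITS = 16          # assumed: near-instant on a phone

    def challenge_bits(requests_last_minute: int) -> int:
        """Double the work for every 30 req/min above a human-ish rate."""
        excess = max(0, requests_last_minute - 10)
        return BASE_DIFFICULTY_BITS + 2 * (excess // 30)

    for rate in (2, 10, 100, 1000):
        bits = challenge_bits(rate)
        print(f"{rate:>5} req/min -> {bits} leading zero bits "
              f"(~{2 ** (bits - BASE_DIFFICULTY_BITS)}x the baseline work)")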

I like your example because quarter-deposit shopping carts are not universal. Some societies have either accepted shopping-cart shrinkage as an acceptable cost of doing business or found better ways to deter it.

Scrapers are orders of magnitude faster than humans at browsing websites. If the challenge takes 1 second but a human stays on the page for 3 minutes, it's negligible. But if the challenge takes 1 second and the scraper does its job in 5 seconds, you already have a 20% slowdown.

  • By that logic you could just make your website load slower in general to make scraping harder.

    • No, because in this case there are cookies involved. If the scraper accepts cookies, it's trivial to detect and block it; if it doesn't, it has to solve the challenge every single time (see the sketch after this thread).

  • Scrapers do not care about a 20% slowdown. All they care about is being able to scale up, and this does not block any attempt to scale up.
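
A minimal sketch of the cookie gate described in the reply above, assuming a generic signed-token scheme rather than Anubis's actual implementation; the secret handling, per-token budget, and in-memory counter are all placeholders:

    # Sketch of a cookie gate: a solved PoW challenge earns a signed token,
    # and per-token request counts expose a scraper that reuses one token
    # at inhuman rates. Not any real tool's code; illustrative only.

    import hashlib, hmac, time
    from collections import defaultdict

    SECRET = b"server-side secret"          # assumption: rotated out of band
    requests_per_token = defaultdict(int)   # in-memory counter for the sketch

    def issue_token(client_id: str) -> str:
        """Handed out once the client solves the PoW challenge."""
        payload = f"{client_id}:{int(time.time())}"
        sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        return f"{payload}:{sig}"

    def handle_request(token: str | None, client_id: str) -> str:
        if token is None:
            return "serve PoW challenge"        # no cookie: pay every time
        payload, _, sig = token.rpartition(":")
        expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            return "serve PoW challenge"        # forged or mangled cookie
        requests_per_token[token] += 1
        if requests_per_token[token] > 300:     # arbitrary per-token budget
            return "block or re-challenge"      # one token scraping the site
        return "serve page"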