Comment by account42

2 days ago

No, they are a lazy measure. Most websites that slap on these kinds of checks don't even bother with more human-friendly measures first.

Because I don't have the fucking time to deal with AI scraper bots. I went harder: anything that looks even suspiciously close to a scraper and isn't one of Google's verified crawlers [1], or that has wget in its user agent, gets its entire /24 hard-banned for a month, with an email address to contact for unbanning.

That seems to be a pretty effective way, for now, to keep scrapers, spammers and other abusive behavior away. Normal users don't do certain site actions at the speed that scraper bots do, there's no other practically relevant search engine than Google, I've never once seen an abusive bot identify itself as wget (they all try to look like a human-operated web browser), and no AI agent is yet smart enough to figure out how to interpret the message "Your ISP's network appears to have been used by bot activity. Please write an email to xxx@yyy.zzz with <ABC> as the subject line (or click on this pre-filled link) and you will automatically get unblocked".

[1] https://developers.google.com/search/docs/crawling-indexing/...
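
A minimal sketch of what such a filter might look like, assuming a Flask app, an in-memory ban table, and the reverse-DNS crawler verification from [1]; the looks_like_scraper helper and the exact ban message are placeholders, not the commenter's actual setup:

```python
# Hypothetical sketch: ban a /24 for a month when a request looks like a
# scraper and isn't a verified Google crawler, or announces itself as wget.
# Flask, the in-memory ban table and the helper names are assumptions.
import ipaddress
import socket
import time

from flask import Flask, abort, request

app = Flask(__name__)

BAN_SECONDS = 30 * 24 * 3600   # one month
banned_subnets = {}            # /24 network -> expiry timestamp


def is_verified_google_crawler(ip: str) -> bool:
    """Reverse-DNS check described in [1]: resolve the IP to a hostname,
    require a Google domain, then forward-resolve and require a match."""
    try:
        host = socket.gethostbyaddr(ip)[0]
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        return socket.gethostbyname(host) == ip
    except OSError:
        return False


def looks_like_scraper(req) -> bool:
    """Placeholder for whatever rate/behaviour heuristic the site uses."""
    return False


@app.before_request
def ban_scrapers():
    ip = request.remote_addr
    subnet = ipaddress.ip_network(f"{ip}/24", strict=False)

    # Already banned: refuse with the unban instructions.
    expiry = banned_subnets.get(subnet)
    if expiry and expiry > time.time():
        abort(403, "Your ISP's network appears to have been used by bot "
                   "activity. Please write an email to xxx@yyy.zzz with "
                   "<ABC> as the subject line and you will get unblocked.")

    user_agent = (request.user_agent.string or "").lower()
    suspicious = looks_like_scraper(request) and not is_verified_google_crawler(ip)
    if suspicious or "wget" in user_agent:
        banned_subnets[subnet] = time.time() + BAN_SECONDS
        abort(403)
```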

  • > Normal users don't do certain site actions at the speed that scraper bots do

    How would you know, when you have already banned them?

    • Simple. A honeypot link buried three levels deep in a menu that no ordinary human would care about, and which, thanks to a JS animation, takes a human at least half a second to click. Any bot that clicks it in less than half a second gets the banhammer. No need for invasive tracking, third-party integrations, whatever. (A rough sketch of the timing check follows below.)

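A rough sketch of that timing check, under the same assumptions as the Flask sketch above; embedding a signed render timestamp in the honeypot URL is one possible way to wire it up, not necessarily what the commenter runs:

```python
# Hypothetical sketch: the honeypot link carries a signed timestamp of when
# the page was rendered; a hit that arrives faster than the half-second a
# human needs to navigate the animated menu bans the caller's /24.
import hashlib
import hmac
import time

from flask import Flask, abort, request

app = Flask(__name__)

SECRET = b"change-me"        # assumption: some server-side secret
MIN_HUMAN_SECONDS = 0.5      # the half-second threshold from the comment


def signed_token(rendered_at: float) -> str:
    sig = hmac.new(SECRET, f"{rendered_at}".encode(), hashlib.sha256).hexdigest()
    return f"{rendered_at}:{sig}"


@app.route("/")
def index():
    # The honeypot link sits deep in the menu markup and is only reachable
    # by a human after the JS animation has played out.
    token = signed_token(time.time())
    return f'... <a href="/honeypot?t={token}" rel="nofollow">archive</a> ...'


@app.route("/honeypot")
def honeypot():
    ts, _, sig = request.args.get("t", "").partition(":")
    try:
        rendered_at = float(ts)
    except ValueError:
        abort(400)
    expected_sig = signed_token(rendered_at).split(":")[1]
    if not hmac.compare_digest(sig, expected_sig):
        abort(400)
    if time.time() - rendered_at < MIN_HUMAN_SECONDS:
        ban_subnet(request.remote_addr)   # the /24 ban from the sketch above
    abort(404)                            # nothing real lives here either way


def ban_subnet(ip: str) -> None:
    """Placeholder for the /24 ban logic sketched earlier."""
```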