Comment by nottorp
1 day ago
Okay, it's been established that "AI" crawlers are a pest. One of the reasons is that they don't actually run any "AI" while crawling; that would be too expensive.
You can't ban by user agent because that will only catch the few crawlers that are actually honest about it.
Aren't there rate limiting solutions built into at least some web servers? And if you control your own server, can't you do it through a reverse proxy?
Cut off IPs that make more than NN requests in a minute? Require some kind of login to allow more, if you do have endpoints that are designed to be hit in bulk?
There should still be ready-made solutions for this, in spite of the current answer being "lulz it's too hard, just use cloudflare".
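For what it's worth, nginx's built-in limit_req module does roughly what's described above. A minimal sketch, assuming nginx as the reverse proxy; the zone name, cookie name, rates, and upstream address are all placeholders to tune for your site:

```nginx
# Anonymous clients are keyed by IP; requests carrying a session cookie get
# an empty key, which nginx treats as "not rate limited" (the cookie name is
# a placeholder for whatever your login system actually sets).
map $cookie_session $limit_key {
    ""      $binary_remote_addr;   # no cookie: limit by client IP
    default "";                    # logged in: exempt from the limit
}

# Track per-key counters in a 10 MB shared zone at ~10 requests/second
# (i.e. roughly 600 a minute; numbers are illustrative).
limit_req_zone $limit_key zone=perip:10m rate=10r/s;

server {
    listen 80;

    location / {
        # Absorb short bursts, then reject with 429 instead of the default 503.
        limit_req zone=perip burst=20 nodelay;
        limit_req_status 429;

        proxy_pass http://127.0.0.1:8080;  # hypothetical upstream app
    }
}
```

HAProxy can do the same with stick tables, and Traefik ships a RateLimit middleware, so the ready-made pieces do exist outside Cloudflare.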