Comment by simondotau

4 hours ago

My base advice is to make sure you have a very efficient code path for login pages. Ten pages per second is nothing if you don't have to perform any database queries (a request with no authentication token has nothing to validate).

Beyond that, look at how the bots are finding new URLs to probe, and don't give them access to those lists/indexes. In particular, don't forget about site maps. I use Cloudflare rules to restrict my site map to known bots only.
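As an illustration of that kind of rule (not the commenter's actual configuration; the sitemap path is an assumption), a Cloudflare WAF custom rule with a Block action could match sitemap requests from anything other than a verified bot:

```
(http.request.uri.path eq "/sitemap.xml") and not cf.client.bot
```

`cf.client.bot` is true only for crawlers Cloudflare has verified (Googlebot, Bingbot, etc.), so everyone else gets blocked before the request reaches the origin.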

Of course. My server wasn't struggling with that. I haven't benchmarked that server, but on an M1 Max, the app can easily serve hundreds of requests per second for profile pages, which are the heaviest thing an unauthenticated user can access (I cache a lot in memory, but posts, photos, and friend lists aren't cached). It was just a mild annoyance.

They discovered those URLs simply by parsing pages that contain like buttons. Those do have rel="nofollow" on them, and the URL pattern is disallowed in robots.txt, but I'd be surprised if that stopped someone who uses thousands of IPs to proxy their requests. I don't have a site map.
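For context, the disallow rule described might look like this (the `/like/` path is a hypothetical stand-in; the actual URL pattern isn't given in the thread). Well-behaved crawlers honor it, but as noted above, it does nothing against bots that ignore robots.txt:

```
User-agent: *
Disallow: /like/
```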