Comment by Alupis

3 years ago

Not every website is the same, folks.

> You are literally complaining about handling 1 request every 2 seconds

I don't know where this came from. The inconsiderate bots tend to flood your server, likely someone doing some sort of naïve parallel crawl. Not every website has a full-stack in-house team behind it to implement custom server-side throttles and what-not either.

However, like I mentioned already, if every single visitor is getting the challenge, then either the site is experiencing an attack right now, or the operator has the security settings set too high. Some commonly-targeted websites seem to keep security settings high even when not actively experiencing an attack. To those operators, staying online is more important than slightly annoying some small subset of visitors once.

> crawls every single webpage (and we have 10's of thousands) every couple days

Taking ~100,000 pages over 2 days: 100,000 / (2 × 86,400 s) ≈ 0.58 req/sec.

I acknowledge that those requests are likely bursty, but you were complaining as if the total amount was the problem. If the instantaneous request rate is the actual problem, you should be able to throttle on that, no?
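
For concreteness, here is a minimal sketch of the kind of per-IP throttle I have in mind — a plain token bucket keyed by client address. This is just an illustration, not anyone's actual implementation; the rate numbers, function names, and the example IP are all made up for the example:

```python
import time
from collections import defaultdict

# Illustrative token-bucket throttle keyed by client IP.
# The numbers (2 req/sec sustained, burst of 10) are assumptions
# chosen for the example, not taken from this thread.
RATE = 2.0    # tokens refilled per second
BURST = 10.0  # maximum bucket size (allowed burst)

_buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

def allow_request(client_ip: str) -> bool:
    """Return True if this IP is still under its instantaneous rate limit."""
    bucket = _buckets[client_ip]
    now = time.monotonic()
    # Refill tokens in proportion to elapsed time, capped at the burst size.
    bucket["tokens"] = min(BURST, bucket["tokens"] + (now - bucket["last"]) * RATE)
    bucket["last"] = now
    if bucket["tokens"] >= 1.0:
        bucket["tokens"] -= 1.0
        return True
    return False  # over the limit; the server would answer 429 or similar

if __name__ == "__main__":
    # Quick demonstration: a burst of 15 requests from one (hypothetical) IP.
    for i in range(15):
        print(i, allow_request("203.0.113.7"))
```

In practice this would more likely be a reverse-proxy setting (nginx's limit_req, say) than hand-rolled application code, which is part of the point: it doesn't take a full-stack team.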

I can totally believe your site has a bunch of accidental complexity that is harder to fix than just pragmatically hassling users. But it'd be better for everyone if this were acknowledged explicitly rather than talked about as an inescapable facet of the web.

  • > If the instantaneous request rate is the actual problem, you should be able to throttle on that, no?

    Again, not every website is the same, and not every website has a huge team behind it to deal with this stuff. Spending 30-something developer hours implementing custom rate limiting and throttling, distributed caching, gateways, etc. is absurd for probably 99% of websites.

    You can pay Cloudflare $0.00 and get good enough protection without spending a second thinking about the problem. That is why you see it commonly...

    If your website does not directly generate money for you or your business, then sinking a bunch of resources into it is silly. You will likely never experience this sort of challenge on an ecommerce site, for instance... but a blog or forum? Absolutely.

    • Actually I get hassled all the time on various ecommerce sites. Because once centralizing entities offer an easy-to-check "even moar security" box, people tend to check it lest they get blamed for not doing so. And then it stays on, since the legitimate users who closed the page out of frustration surely get counted in the "attackers protected against" metric!

      I'd say you're really discounting the amount of hassle people get from these challenges. Some sites hassle users every visit. Some hassle users every few pages. Some hassle logged in users. Some just go into loops (as in OP). Some don't even pop up a challenge and straight up deny based on IP address!

      And since we're talking about abstract design, why can't Cloudflare et al change their implementations to throttle based on individual IPs, rather than blanket discriminating against more secure users? Maybe you personally have taken the best option available to you. But that doesn't imply the larger dynamic is justifiable.


In my experience, if bots start flooding a server, it's the ISP/hosting provider that gets angry and contacts the owner first.