Comment by marginalia_nu

3 years ago

Dude, my struggle is to be able to operate a non-profit service free of charge with no ads, and the aim is to help users find a way out of exactly the aforementioned digital mall and onto the free and independent, mostly static web.

This is harder every day due to all the bot traffic helping itself to a disproportionate share of my service via botnets that are all but indistinguishable from human traffic.

But yeah, must be the corporate profits I'm hoarding...?

Why is it relevant whether the traffic is human or automated? The whole point of the internet is you can put a server out there and anyone anywhere can connect to it with any HTTP client.

To me it seems like the only people who care about that are those who want to sell our attention to the highest bidder via advertising. Wouldn't you be having the same difficulties if there were just as much traffic coming from humans?

  • I want to provide as many human beings as possible with value by distributing my processing power fairly between them. If I get DDoSed by a botnet, I won't provide anyone with anything other than, at best, an error page.

    If I had infinite money and computing resources, this would be fine, but I'm just one guy with a not very powerful computer hosted on domestic broadband, and even though I give away compute freely, it just takes one bag of dicks with a botnet to use it all up for themselves, and without bot mitigation, I'm helpless to prevent it.

    Oh and I actually do provide an API for free machine access, so it's not like they have to use headless browsers and go through the front door like this. But they still do.

    Serves me right for trying to provide a useful service I guess?

    • Arguably, the problem here is that you want to do it free of charge. That's the problem in general: adtech aside, people want to discriminate between "humans" and "bots" in order to fairly distribute resources. What should be happening, though, is that every user - human and bot alike - covers their resource usage on the margin.

      Tangent: there's a reason the browser is/used to be called a user agent. The web was meant to be accessed by automation. When I use a script to browse the web for me with curl, that script/curl is as much my agent as the browser is.

      I see how remote attestation and other bot detection/prevention techniques make it cheaper for you to run the service the way you do. But the flip side is, those techniques will get everyone stuck using shitty, anti-ergonomic browsers and apps, whose entire UX is designed to best monetize the average person at every opportunity. In this reality, it wouldn't be possible to even start a service like yours without joining some BigCo that can handle the contractual load of interacting with every other business entity...

      (Also, need I remind everyone that while the first customers of remote attestation are the DRM-ed media vendors, the second customer is your bank, and all the other banks.)

    • I see. I respect that.

      The bot detection won't come without cost. It will centralize power in the hands of Cloudflare and other giants. I think it's only a matter of time until they start exercising their powers. Is this really an acceptable tradeoff?

      If we do accept it, I think the day will come when Cloudflare starts rejecting non-Chrome browsers, to say nothing of non-browser user agents.

    • Is this a purposeful DDoS or just bots trying to scrape results? If this is a DDoS on purpose, what's their financial gain? Did they demand payment?

      If you're talking about bots scraping content, then the question is also why. Perhaps by letting them do so, you indirectly provide value to even more human beings?

      It's entirely possible that these questions are absurd; however, since scraping with headless browsers is not free, there must be some reason for scraping a given service, and it's usually something that ultimately benefits more human beings.

  • >Why is it relevant whether the traffic is human or automated?

    Because all traffic costs the service provider money, but automated traffic can simulate thousands of users for less than it costs to serve one human user. A human is bounded by time and by the cost of computation and bandwidth, whereas automation is not, giving bots the opportunity to DoS you, either on purpose or just accidentally.

    • But the solution for this is a rate limit, not a captcha. The real reason they care about "human traffic" is because bots don't buy stuff.

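For what it's worth, the per-client rate limit proposed in the last comment is commonly implemented as a token bucket. A minimal sketch in Python, where the capacity, refill rate, and keying clients by an ID are illustrative assumptions rather than anything described in the thread:

```python
import time


class TokenBucket:
    """Per-client token-bucket rate limiter (illustrative sketch).

    Each client can burst up to `capacity` requests; tokens refill
    continuously at `rate` tokens per second.
    """

    def __init__(self, capacity=10, rate=1.0):
        self.capacity = capacity
        self.rate = rate
        self.tokens = {}  # client_id -> remaining tokens
        self.last = {}    # client_id -> timestamp of last request

    def allow(self, client_id, now=None):
        """Return True if this request fits within the client's budget."""
        if now is None:
            now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        last = self.last.get(client_id, now)
        tokens = self.tokens.get(client_id, self.capacity)
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        self.last[client_id] = now
        if tokens >= 1:
            self.tokens[client_id] = tokens - 1
            return True
        self.tokens[client_id] = tokens
        return False
```

A server would key `client_id` by IP address or API key and return HTTP 429 when `allow` is False. This also illustrates the tension in the thread: a botnet spreads requests across many IPs, so each bot stays under a per-IP limit while the aggregate still exhausts the service.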