Comment by throwawayffffas

3 days ago

> That's what it's for, isn't it? Make crawling slower and more expensive.

The default settings produce a computational cost of milliseconds for a week of access. For this to be relevant it would have to be significantly more expensive to the point it would interfere with human access.

I thought the point (which the article misses) is that a token gives you an identity, and an identity can be tracked and rate limited.

So a crawler that behaves ethically and puts very little strain on the server should indeed be able to crawl for a whole week on cheap compute, while one that hammers the server hard will not.
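
To illustrate the "identity that can be rate limited" idea, here is a minimal sketch of a per-token rate limiter. It is not how Anubis or any particular server does it; the class name `PerIdentityLimiter` and the `rate` and `burst` parameters are made up for the example.

```python
class PerIdentityLimiter:
    """Token-bucket style limiter keyed by an identity (e.g. a proof-of-work token).

    Hypothetical illustration: `rate` is requests refilled per second,
    `burst` is the maximum number of requests allowed back to back.
    """

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst
        self._state = {}  # identity -> (remaining allowance, time last seen)

    def allow(self, identity: str, now: float) -> bool:
        # A new identity starts with a full allowance.
        allowance, last = self._state.get(identity, (self.burst, now))
        # Refill in proportion to the time elapsed, capped at the burst size.
        allowance = min(self.burst, allowance + (now - last) * self.rate)
        if allowance >= 1.0:
            self._state[identity] = (allowance - 1.0, now)
            return True
        self._state[identity] = (allowance, now)
        return False


limiter = PerIdentityLimiter(rate=1.0, burst=3)
# A crawler that fires 10 requests at once only gets its burst through.
hammer = [limiter.allow("hammer", 0.0) for _ in range(10)]
# A crawler that asks once every 2 seconds is never refused.
polite = [limiter.allow("polite", float(t)) for t in range(0, 20, 2)]
```

With a scheme like this, the slow crawler never notices the limit, and the aggressive one is throttled for as long as it keeps the same token.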

  • Sure, but it's really cheap to mint new identities: each node in their scraping cluster can mint hundreds of thousands of tokens per second.

    Provisioning new IPs is probably more costly than calculating the tokens, at least with the default difficulty setting.
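
For a feel of what minting a token costs, here is a rough sketch of a generic hashcash-style proof of work over SHA-256. It is an assumption-laden illustration, not Anubis's actual challenge format: the function names and the `difficulty_bits` parameter are hypothetical.

```python
import hashlib
import itertools


def _hash_value(challenge: str, nonce: int) -> int:
    """SHA-256 of challenge+nonce, interpreted as a 256-bit integer."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Valid if the hash has at least `difficulty_bits` leading zero bits."""
    return _hash_value(challenge, nonce) < (1 << (256 - difficulty_bits))


def solve(challenge: str, difficulty_bits: int) -> int:
    """Brute-force the smallest nonce that satisfies the difficulty."""
    for nonce in itertools.count():
        if verify(challenge, nonce, difficulty_bits):
            return nonce


def expected_hashes(difficulty_bits: int) -> int:
    """Average number of hashes needed: each extra bit doubles the work."""
    return 2 ** difficulty_bits


nonce = solve("example-challenge", 12)  # roughly 4096 hashes on average
```

The server checks a solution with a single hash, while the client needs about `expected_hashes(difficulty_bits)` attempts, so the minting rate of a node is roughly its hash rate divided by that number.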

...unless you're sus, then the difficulty increases. And if you unleash a single scraping bot, you're not a problem anyway. It's for botnets of thousands, mimicking browsers on residential connections to make them hard to filter out or rate limit, effectively DDoSing the server.

Perhaps you just don't realize how much the scraping load has increased in the last 2 years or so. If your server can stay up after deploying Anubis, you've already won.

  • How is it going to hurt those?

    If it's an actual botnet, then it's hijacked computers belonging to other people, who are the ones paying the power bills. The attacker doesn't care that each computer takes a long time to calculate. If you have 1000 computers each spending 5s/page, then your botnet can retrieve 200 pages/s.

    If it's just a cloud deployment, it still has resources that vastly outstrip a normal person's.

    The fundamental issue is that you can't serve example.com slower than a legitimate user on a crappy 10-year-old laptop could tolerate, because that starts losing you real human users. So if, let's say, a user is happy to wait 5 seconds per page at most, then this is absolutely no obstacle to a modern 128-core Epyc. If you make it troublesome for the 128-core monster, then no normal person will find the site usable.
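
The arithmetic in this comment can be written out as a small back-of-the-envelope sketch. The function names and the `speedup` factor are hypothetical; how much faster a big server really is than an old laptop is an assumption, not a measurement.

```python
def botnet_pages_per_second(machines: int, seconds_per_page: float) -> float:
    """Aggregate throughput when every machine solves one challenge per page."""
    return machines / seconds_per_page


def seconds_on_fast_machine(seconds_on_slow_machine: float, speedup: float) -> float:
    """Time the same challenge takes on hardware `speedup` times faster."""
    return seconds_on_slow_machine / speedup


# 1000 hijacked machines at 5 s/page, as in the comment above.
throughput = botnet_pages_per_second(1000, 5.0)  # 200 pages/s

# A challenge tuned to the 5 s a human on a slow laptop will tolerate,
# run on hardware assumed (for illustration only) to be 100x faster.
fast_time = seconds_on_fast_machine(5.0, 100.0)
```

Whatever ceiling the slowest tolerated human sets, the attacker's cost is that ceiling divided by however much faster, or more numerous, their machines are.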

    • It's not really hijacked computers; there is a whole market for VPNs with residential exit nodes.

      The way I think it works is they provide a free VPN to the users, or even pay their internet bill, and then sell access to their IP.

      The client just connects to a VPN and has a residential exit IP.

      The cost of the VPN is probably higher than the cost for the proof of work though.

    • > How is it going to hurt those?

      In an endless cat-and-mouse game, it won't.

      But right now, it does, as these bots tend to be really dumb (presumably, a more competent botnet user wouldn't have it do the equivalent of copying Wikipedia by crawling through every single page in the first place). With a bit of luck, it will be enough until the bubble bursts and the problem is gone, and you won't need to deploy Anubis just to keep your server running anymore.