Comment by akoboldfrying
3 days ago
The (almost only?) distinguishing factor between genuine users and bots is the total volume of requests, but volume alone can still be leveraged to impose asymmetric costs. If botPain > botPainThreshold and humanPain < humanPainThreshold, then Anubis is working as intended. The key point is that those two inequalities look quite different at the next level of detail. A very rough model might be:
botPain = nBotRequests * cpuWorkPerRequest * dollarsPerCpuSecond
humanPain = c_1 * max(elapsedTimePerRequest) + c_2 * avg(elapsedTimePerRequest)
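A minimal Python transcription of that model, just to make the shape of the two pain functions concrete; every constant below (request counts, CPU price, the c_1/c_2 weights) is invented purely for illustration:

    import statistics

    def bot_pain(n_bot_requests, cpu_seconds_per_request, dollars_per_cpu_second):
        # Bots only feel the aggregate compute bill across the whole crawl.
        return n_bot_requests * cpu_seconds_per_request * dollars_per_cpu_second

    def human_pain(elapsed_seconds_per_request, c1, c2):
        # Humans feel the single worst stall (c1 term) plus the typical stall (c2 term).
        return (c1 * max(elapsed_seconds_per_request)
                + c2 * statistics.mean(elapsed_seconds_per_request))

    # Invented numbers: a 10M-page crawl at a cloud-ish CPU price,
    # versus a human reading 50 pages with a 500 ms challenge on each.
    print(bot_pain(10_000_000, 0.5, 0.00001))        # ~$50 of compute for the entire crawl
    print(human_pain([0.5] * 50, c1=1.0, c2=0.1))    # the human's pain, in arbitrary units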
The article points out that the botPain Anubis currently generates is unfortunately much too low to hit any realistic threshold. But if the cost model I've suggested above is in any way realistic, then useful improvements would include:
1. More frequent but less taxing computation demands (this helps most if c_1 >> c_2, i.e. human annoyance is dominated by the single worst delay)
2. Parallel computation (this shortens the wall-clock delay for humans while leaving the bot's total CPU cost untouched)
ETA: Concretely, regarding (1), I would tolerate 500ms lag on every page load (meaning forget about the 7-day cookie), and wouldn't notice 250ms.
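To put rough (again, invented) numbers on (1) and (2): splitting the same total CPU demand into many small challenges leaves the bot's bill unchanged but shrinks the worst stall a human ever sees, and parallelising the small challenge across a few workers shrinks the wall-clock stall further without refunding the bot a single CPU-second. A sketch under those assumptions:

    C1, C2 = 1.0, 0.1                  # invented weights, with c_1 >> c_2
    DOLLARS_PER_CPU_SECOND = 0.00001   # invented cloud CPU price

    def compare(cpu_seconds_per_challenge, pages_per_challenge, workers=1, bot_pages=10_000_000):
        # Bot pain: total CPU-seconds over the crawl, priced per second. Splitting the
        # same total work into smaller, more frequent challenges doesn't change this.
        total_cpu = (bot_pages / pages_per_challenge) * cpu_seconds_per_challenge
        bot_dollars = total_cpu * DOLLARS_PER_CPU_SECOND
        # Human pain: dominated by the worst wall-clock stall; parallel workers shrink
        # the stall even though the CPU-seconds spent stay the same.
        worst_stall = cpu_seconds_per_challenge / workers
        avg_stall = worst_stall / pages_per_challenge    # only 1 in pages_per_challenge loads stalls
        human = C1 * worst_stall + C2 * avg_stall
        print(f"bot ~${bot_dollars:.0f}, worst stall {worst_stall * 1000:.0f} ms, humanPain {human:.3f}")

    compare(5.0, pages_per_challenge=20)              # one big challenge amortised over 20 pages
    compare(0.25, pages_per_challenge=1)              # small challenge on every page load
    compare(0.25, pages_per_challenge=1, workers=4)   # same, split across 4 parallel workers

The bot's bill comes out identical in all three cases, while the human's worst stall drops from seconds to tens of milliseconds.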
That's exactly what I'm saying isn't happening: the user pays some cost C per article, and the bot pays exactly the same cost C. Both obtain the same reward. That's not how Hashcash works.
I'm saying your notion of "the same cost" is off. They pay the same total CPU cost, but that isn't the actual perceived cost in each case.
Can you flesh that out more? In the case of AI scrapers it seems especially clear: the model companies just want tokens, and are paying a (one-time) cost of C for N tokens.
Again, with Hashcash, this isn't how it works: most outbound spam messages are worthless to the spammer. The point of the system is to exploit the negative exponent on the attacker's value function: almost every additional message the attacker sends is worth next to nothing, so a per-message cost that a legitimate sender never notices wipes out the attacker's margin.
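A deliberately crude sketch of that asymmetry, with invented prices and a flat (rather than declining) value per spam message, just to show why a stamp that costs a legitimate sender essentially nothing kills the spammer's economics:

    CPU_DOLLARS_PER_SECOND = 0.00001   # invented compute price
    STAMP_CPU_SECONDS = 1.0            # invented per-message proof-of-work cost

    def campaign(messages, value_per_message):
        # The stamp is paid on every message, whether or not that message earns anything.
        stamp_cost = messages * STAMP_CPU_SECONDS * CPU_DOLLARS_PER_SECOND
        profit = messages * value_per_message - stamp_cost
        print(f"stamps ${stamp_cost:,.4f}, expected profit ${profit:,.2f}")

    campaign(10_000_000, 0.000001)   # spammer: $10 of expected value against $100 of stamps
    campaign(20, 1.0)                # ordinary sender: the stamps cost a fraction of a cent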