Comment by Retr0id
8 days ago
Last time I checked, Anubis used SHA256 for PoW. This is very GPU/ASIC friendly, so there's a big disparity between the amount of compute available in a legit browser vs a datacentre-scale scraping operation.
A more memory-hard "mining" algorithm could help.
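For context, this style of browser proof of work boils down to finding a nonce whose hash of (challenge + nonce) falls below a difficulty target. Here's a minimal sketch; the function names and the leading-zero-bits difficulty scheme are illustrative assumptions, not Anubis's actual protocol:

```python
import hashlib

def solve_pow(challenge: str, difficulty_bits: int) -> int:
    """Brute-force a nonce so that SHA-256(challenge + nonce) has
    `difficulty_bits` leading zero bits. Illustrative sketch only."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify_pow(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server-side check: one hash, regardless of how hard solving was."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))
```

Because the work is pure SHA-256 hashing, it parallelizes trivially onto GPU/ASIC hardware, which is the asymmetry being pointed out here.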
A different algorithm would not help.
Here's the basic problem: the fully loaded cost of a server CPU core is ~1 cent/hour. The most latency you can afford to inflict on real users is a couple of seconds. That means the cost of passing a challenge the way the users pass it, with a CPU running Javascript, is about 1/1000th of a cent. And then that single proof of work will let them scrape at a minimum hundreds, but more likely thousands, of pages.
So a millionth of a cent per page. How much engineering effort is worth spending on optimizing that? Basically none, certainly not enough to offload to GPUs or ASICs.
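The arithmetic above can be checked back-of-envelope style (the 1 cent/hour, 2 seconds, and 1000 pages figures are the estimates from the comment, not measured values):

```python
# Back-of-envelope cost of passing one challenge, per the figures above.
CORE_COST_CENTS_PER_HOUR = 1.0   # fully loaded server CPU core (estimate)
CHALLENGE_SECONDS = 2.0          # most latency tolerable for real users
PAGES_PER_SOLVE = 1000           # pages scraped per proof of work (estimate)

cost_per_solve = CORE_COST_CENTS_PER_HOUR * CHALLENGE_SECONDS / 3600
cost_per_page = cost_per_solve / PAGES_PER_SOLVE

print(f"{cost_per_solve:.6f} cents per solve")  # roughly 1/1800th of a cent
print(f"{cost_per_page:.2e} cents per page")    # well under a millionth of a cent
```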
No matter where the bar is there will always be scrapers willing to jump over it, but if you can raise the bar while holding the user-facing cost constant, that's a win.
No, but what I'm saying is that these scrapers are already not using GPUs or ASICs. It just doesn't make any economic sense to do that in the first place. They are running the same Javascript code on the same commodity CPUs and the same Javascript engine as the real users. So switching to an ASIC-resistant algorithm will not raise the bar. It's just going to be another round of the security theater that proof of work was in the first place.
2 replies →
Sorry, but it is actually the completely wrong solution. This is not a "make it expensive for spammers" problem. They only need to download the source code once, with a background process where it doesn't matter if it takes hours.
Besides the "get the source code for training data" scenario, the other access pattern is just downloading on behalf of an end user's "agent". Which again: the end user is running something in the background, doesn't care how long it takes or how much it costs. It's not a volume or spam type problem.