Comment by debugnik
3 days ago
That's not bypassing it; that's them finally engaging with the PoW challenge as intended, making crawling slower and more expensive instead of failing to crawl at all, which is more of a plus.
This does, however, force servers to increase the challenge difficulty, which increases the wait on first-time access.
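For context, this kind of challenge is hashcash-style: the client brute-forces a nonce until a hash of the challenge meets a difficulty target, and the server verifies with a single hash. Here's a minimal sketch of that shape in Python; it's not Anubis's actual code, and the "leading zero hex digits" target is just one common formulation:

    import hashlib
    import secrets

    def solve(challenge: bytes, difficulty: int) -> int:
        """Brute-force a nonce so sha256(challenge || nonce) starts with
        `difficulty` zero hex digits. Expected work: ~16**difficulty hashes."""
        nonce = 0
        target = "0" * difficulty
        while True:
            digest = hashlib.sha256(challenge + str(nonce).encode()).hexdigest()
            if digest.startswith(target):
                return nonce
            nonce += 1

    def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
        """One hash to verify: cheap for the server, costly for the client."""
        digest = hashlib.sha256(challenge + str(nonce).encode()).hexdigest()
        return digest.startswith("0" * difficulty)

    challenge = secrets.token_bytes(16)     # server-issued per visitor
    nonce = solve(challenge, difficulty=4)  # each +1 difficulty is ~16x the work
    assert verify(challenge, nonce, difficulty=4)

Each extra target digit multiplies the expected client-side work by about 16 while verification stays a single hash, which is exactly why raising the difficulty raises the first-visit wait.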
Obviously the developer of Anubis thinks it is bypassing: https://github.com/TecharoHQ/anubis/issues/978
Fair, then I obviously think Xe may have a kinda misguided understanding of their own product. I still stand by the concept I stated above.
latest update from Xe:
> After further investigation and communication. This is not a bug. The threat actor group in question installed headless chrome and simply computed the proof of work. I'm just going to submit a default rule that blocks huawei.
The point is that it will always be cheaper for bot farms to pass the challenge than for regular users.
Why does that matter? The challenge needs to stay expensive enough to slow down bots, but legitimate users won't be solving anywhere near the same number of challenges, and the alternative is the site getting crawled to death, so they can wait once in a while.
It might be a lot closer if they were using argon2 instead of SHA. SHA is kind of a bad choice for this sort of thing: it's exactly what GPUs and ASICs are optimized to compute, while argon2 is memory-hard, so rented accelerators buy a bot farm much less of an edge.
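To illustrate the difference, here's a sketch using the third-party argon2-cffi package; the parameters are illustrative assumptions, not anything Anubis actually uses:

    import hashlib
    from argon2.low_level import hash_secret_raw, Type  # pip install argon2-cffi

    challenge, nonce = b"server-challenge", b"12345"

    # SHA-256: microseconds per guess, and trivially parallel on the
    # GPU/ASIC hardware a crawling operation can rent cheaply.
    fast = hashlib.sha256(challenge + nonce).digest()

    # Argon2id: tunably slow AND memory-hard (64 MiB here), so every
    # parallel guess needs its own RAM and accelerators lose their edge.
    slow = hash_secret_raw(
        secret=nonce,
        salt=challenge,         # reusing the challenge as salt, for the sketch
        time_cost=3,
        memory_cost=64 * 1024,  # in KiB
        parallelism=1,
        hash_len=32,
        type=Type.ID,
    )

With a memory-hard function, the cost per guess on a laptop and on a GPU farm is much closer, which narrows the gap the parent comment is pointing at.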
Too bad the challenge's result is only a waste of electricity. Maybe they should do what some of those alt-coins do and search for prime numbers or something similar instead.
Most of those alt-coins are kind of fake/scams. It's really hard to make it work with actually useful problems.
Of course that doesn't directly help the site operator. Maybe it could actually do a bit of bitcoin mining for the site owner. Then that could pay for the cost of accessing the site.
this only holds if the data being accessed is less valuable than the computational cost. in this case, that is false: spending a few dollars to scrape the data is more than worth it.
reducing the problem to a cost issue is bound to be short-sighted.
This is not about preventing crawling entirely, it's about preventing crawlers from re-crawling everything way too frequently just because crawling is so cheap. Of course it will always be worth it to crawl the Linux Kernel mailing list, but maybe with a high enough cost per crawl the crawlers will learn to be fine with crawling it only once per hour, for example.
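As a back-of-envelope check (every number below is a made-up but plausible assumption, not measured data), the economics scale with crawl frequency:

    # assumed: 100k-page site, 2 CPU-seconds of PoW per page,
    # $0.04 per vCPU-hour of rented compute
    pages = 100_000
    pow_seconds = 2
    vcpu_hour_usd = 0.04

    cost_per_crawl = pages * pow_seconds / 3600 * vcpu_hour_usd
    print(f"one full crawl: ${cost_per_crawl:,.2f}")                # ~$2.22
    print(f"hourly for a year: ${cost_per_crawl * 24 * 365:,.0f}")  # ~$19,467
    print(f"daily for a year: ${cost_per_crawl * 365:,.0f}")        # ~$811

A human visitor pays those 2 CPU-seconds roughly once per session; the crawler pays them per page, per pass, which is the asymmetry the frequency argument relies on.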
my comment is not about preventing crawling, it's stating that with how much revenue AI is bringing in (real or not), the value of crawling repeatedly >>> the cost of running these flimsy coin-mining algorithms.
At the very least, captchas try to make the human/AI distinction, but these algorithms are purely about making access "expensive". If it's just a capital problem, then it's not a problem for the big corps, who are the ones incentivized to do this in the first place!
Even if human captcha solvers are involved, at the very least that provides society with some jobs (useless as they may be), while these mining algorithms do society no good and waste compute for nothing!