Comment by danieltanfh95

6 months ago

this only holds through if the data to be accessed is less valuable than the computational cost. in this case, that is false and spending a few dollars to scrape data is more than worth.

reducing the problem to a cost issue is bound to be short sighted.

2 comments

danieltanfh95

r0uv3n 6 months ago

This is not about preventing crawling entirely, it's about finding a way to prevent crawlers from repeatedly everything way too frequently just because crawling is just very cheap. Of course it will always be worth it to crawl the Linux Kernel mailing list, but maybe with a high enough cost per crawl the crawlers will learn to be fine with only crawling it once per hour for example

danieltanfh95 6 months ago

my comment is not about preventing crawling, its stating that with how much revenue AI is bringing (real or not), the value of crawling repeatedly >>> the cost of running these flimsy coin mining algorithms.
At the very least captcha at least tries to make the human-ai distinction, but these algorithms are just purely on the side of making it "expensive". if its just a capital problem, then its not a problem for these big corpo who are the ones who are incentivized to do so in the first place!
even if human captcha solvers are involved, at the very least it provides the society with some jobs (useless as it may be), but these mining algorithms also do society no good, and wastes compute for nothing!