Comment by bawolff
3 days ago
> This… makes no sense to me. Almost by definition, an AI vendor will have a datacenter full of compute capacity. It feels like this solution has the problem backwards, effectively only limiting access to those without resources or trying to conserve them.
Counterpoint: it seems to work. People use Anubis because it's the best of the bad options.
If theory and reality disagree, it means either you are missing something or your theory is wrong.
Counter-counterpoint: it only stopped them for a few weeks, and now it doesn’t work: https://news.ycombinator.com/item?id=44914773
Geoblocking China and Singapore seems to solve that problem, at least for the non-residential IPs (though I also see a lot of aggressive bots coming from residential IP space in China).
I wish the old trick of sending CCP-unfriendly content to get the Great Firewall to kill the connection for you still worked, but in the days of TLS everywhere it doesn't seem to anymore.
Only Huawei so far, no? That could be easy to block at the network level for the time being.
Of course, we knew from the beginning that this first stage of "bots don't even try to solve it, no matter the difficulty" isn't a forever solution.
AliCloud also seems to send a more capable scraper army, but so far they're not using botnets ("residential proxies") to hide their bad practices.