Comment by ivanstepanovftw

6 months ago

What do these crawlers gather? Just make this data accessible via API calls or direct database download, like Wikipedia did (https://en.wikipedia.org/wiki/Wikipedia:Database_download).

1 comment


mmis1000 6 months ago

The whole reason Anubis exists is that these bots don't give a damn whether the whole dataset is accessible or not, and they even crawl the dynamic links listed in robots.txt at high frequency.

Even Wikipedia has begged those damn bots to stop doing this, since the data is already accessible in the archive here.