Comment by Micanthus
8 hours ago
The page specifically says it's okay for bots to pull data from Anna's Archive; she just asks that they use the bulk downloads instead of crawling the site, so they don't overload the servers:
"""
> We are a non-profit project with two goals:
> 1. Preservation: Backing up all knowledge and culture of humanity.
> 2. Access: Making this knowledge and culture available to anyone in the world (including robots!).
[. . .]
* Our website has CAPTCHAs to prevent machines from overloading our resources, but all our data can be downloaded in bulk:
  * All our HTML pages (and all our other code) can be found in our [GitLab repository](https://software.annas-archive.gl/).
  * All our metadata and full files can be downloaded from our [Torrents page](/torrents), particularly `aa_derived_mirror_metadata`.
  * All our torrents can be programmatically downloaded from our [Torrents JSON API](https://annas-archive.gl/dyn/torrents.json).
"""
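For anyone who wants to go the bulk route, here is a rough sketch of what using the Torrents JSON API might look like. Only the URL and the `aa_derived_mirror_metadata` name come from the quoted page. I am assuming the endpoint returns a JSON array of objects, and I have not checked the actual field names, so the filter just looks for the name in any string field. The function names and the User-Agent string are made up for illustration.

```python
import json
import urllib.request

# URL as quoted on the page above.
TORRENTS_JSON_URL = "https://annas-archive.gl/dyn/torrents.json"


def fetch_torrents(url=TORRENTS_JSON_URL, timeout=30):
    """Fetch the torrent list with a single request and parse it as JSON.

    Assumption: the response body is a JSON array of objects.
    """
    request = urllib.request.Request(
        url,
        headers={"User-Agent": "bulk-metadata-example/0.1"},  # hypothetical UA
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def select_matching(entries, needle="aa_derived_mirror_metadata"):
    """Keep entries where any string-valued field contains `needle`.

    The real schema is not assumed here, so every string field is checked
    rather than one specific key. Non-dict items are skipped.
    """
    return [
        entry
        for entry in entries
        if isinstance(entry, dict)
        and any(isinstance(value, str) and needle in value for value in entry.values())
    ]


# Example use (performs one network request, so it is left commented out):
# metadata_torrents = select_matching(fetch_torrents())
# print(len(metadata_torrents))
```

The point of the design is that it makes one request for the whole list and then works locally, which is the kind of access the page is asking for. The torrents themselves would then be downloaded with a regular torrent client.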