Comment by __alexander

4 days ago

Care to share the scraped data? I would love to play around with it.

I'm surprised he got that much data. Goodreads uses several tricks to try to stop scrapers; for example, pagination only works up to a few pages.

  • They might send him a bill for use of resources.

    • I’m wondering how ethical it is to load down a resource in this way, and I’m open to opinions. There is a mention of “I didn’t hammer down the servers,” but what does that really even mean? The site isn’t being used as intended, and I’m just curious how other people feel about that.

I am not sure about the legal side of things here, but a Kaggle dataset would be really cool.