← Back to context

Comment by dspillett

1 day ago

Is there a public dump of the data anywhere that this is based upon, or have they scraped it themselves?

Such as DB might be entertaining to play with, and the threadedness of comments would be useful for beginners to practise efficient recursive queries (more so than the StackExchange dumps, for instance).

Yes, you can see the download HN bash script in the repository now that simply extract the data to your local machine from BigQuery and saves it as a series of gzip JSON files

  • Ah, the repo was 404ing for me last time I checked (seems fine now) so I couldn't inspect that. I'll have a play later.