Comment by 1659447091
8 hours ago
There is a GitHub of the Factbook for anyone who just wants JSON or Markdown files: https://github.com/factbook
"A cache for datasets for the country profiles from the World Factbook in the original (1:1) format from the cia.gov website"
Hi there, thanks for linking this! My GitHub and website both link to and use this source! I just thought putting it in a SQL database and making the entire 1990-2025 range queryable was needed, since I couldn't find one anywhere :)
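For anyone wondering what "putting it in a SQL database" can look like in practice, here is a minimal sketch using Python's built-in sqlite3. It is not the project's actual schema: the `facts` table, its columns, the `build_db` and `history` helpers, and the "Exampleland" rows are all made up for illustration, and the real Factbook JSON is far more nested than this.

```python
import sqlite3

# Placeholder records (hypothetical values, not real Factbook data).
SAMPLE = [
    {"year": 1990, "country": "Exampleland", "field": "Population", "value": "1000000"},
    {"year": 2025, "country": "Exampleland", "field": "Population", "value": "1500000"},
    {"year": 2025, "country": "Otherland", "field": "Population", "value": "42"},
]


def build_db(records):
    """Load flat (year, country, field, value) records into an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE facts (year INTEGER, country TEXT, field TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO facts VALUES (:year, :country, :field, :value)", records
    )
    conn.commit()
    return conn


def history(conn, country, field):
    """Return (year, value) pairs for one country and field, oldest first."""
    cur = conn.execute(
        "SELECT year, value FROM facts "
        "WHERE country = ? AND field = ? ORDER BY year",
        (country, field),
    )
    return cur.fetchall()
```

Calling `history(build_db(SAMPLE), "Exampleland", "Population")` returns that field's value for each year in the table, which is the kind of cross-year question that is awkward to answer from per-year JSON files.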
It is a lot of fun and rewarding to do this! I've done it several times for medium-sized datasets, like Wikipedia dumps, and the entire geospatial dataset to MapReduce it (pgsql). The Wikipedia one was great: I had it set up to query things like "show me all ammunition manufactured after 1950 that is between .30 and .40" and it could return results nearly instantly. The Wikimedia dumps keep the infoboxes and relations intact, so you can do queries like this easily.
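To make the ammunition query concrete, here is a rough sketch of one way it could work once infobox fields have been extracted into a key/value table. This is an assumption about the shape of the data, not the setup described above: the `infobox` table, the `year` and `caliber_in` keys, the `find_cartridges` helper, and the "Cartridge A/B/C" rows are all hypothetical, and it uses sqlite3 rather than Postgres to stay self-contained.

```python
import sqlite3

# Hypothetical infobox fields, one row per (article, key, value).
ROWS = [
    ("Cartridge A", "year", "1962"), ("Cartridge A", "caliber_in", "0.357"),
    ("Cartridge B", "year", "1935"), ("Cartridge B", "caliber_in", "0.308"),
    ("Cartridge C", "year", "1980"), ("Cartridge C", "caliber_in", "0.45"),
]


def load(rows):
    """Create an in-memory key/value table of infobox fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE infobox (article TEXT, key TEXT, value TEXT)")
    conn.executemany("INSERT INTO infobox VALUES (?, ?, ?)", rows)
    # An index on (key, article) keeps the self-join below cheap on large tables.
    conn.execute("CREATE INDEX idx_infobox_key ON infobox (key, article)")
    return conn


def find_cartridges(conn, after_year, min_cal, max_cal):
    """Articles introduced after `after_year` with caliber in [min_cal, max_cal]."""
    sql = """
        SELECT y.article
        FROM infobox AS y
        JOIN infobox AS c ON c.article = y.article
        WHERE y.key = 'year' AND CAST(y.value AS INTEGER) > ?
          AND c.key = 'caliber_in' AND CAST(c.value AS REAL) BETWEEN ? AND ?
        ORDER BY y.article
    """
    return [row[0] for row in conn.execute(sql, (after_year, min_cal, max_cal))]
```

With the placeholder rows, `find_cartridges(load(ROWS), 1950, 0.30, 0.40)` matches only "Cartridge A": B is too early and C is too large. The self-join treats each infobox key as a column, which is one common way to query loosely structured key/value data.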
Do you have a write-up of this somewhere? When I last looked at the Wikipedia dumps, they looked like a mess to parse. How were you getting structured information?