Comment by rsecora
1 year ago
The dataset consists of books from the Anna Archive, each identified by an ISBN. The ISBNs and titles are extracted from datasets [1], which include magazines and books primarily in Chinese, English, and French.
Example: Germany publishes five times more books than the Netherlands [2], and Spain publishes twice as many books as the Netherlands. However, in visualizations, Germany appears similar to the Netherlands, while Spain and Mexico do not aligned with the high-level labels [3].
[1] https://annas-archive.li/datasets
[2] https://internationalpublishers.org/wp-content/uploads/2023/...
[3] https://software.annas-archive.li/AnnaArchivist/annas-archiv...
No comments yet
Contribute on Hacker News ↗