Comment by rsecora

1 year ago

The dataset consists of books from the Anna Archive, each identified by an ISBN. The ISBNs and titles are extracted from datasets [1], which include magazines and books primarily in Chinese, English, and French.

Example: Germany publishes five times more books than the Netherlands [2], and Spain publishes twice as many books as the Netherlands. However, in visualizations, Germany appears similar to the Netherlands, while Spain and Mexico do not aligned with the high-level labels [3].

[1] https://annas-archive.li/datasets

[2] https://internationalpublishers.org/wp-content/uploads/2023/...

[3] https://software.annas-archive.li/AnnaArchivist/annas-archiv...

0 comments

rsecora

No comments yet

Contribute on Hacker News ↗