Comment by goodside
6 years ago
Counted by whom? What benchmark follows the model you’re describing? Does any real-world compressor use dictionaries anywhere near this big?
If you can bring the complete benchmark corpus (or substantial subsets of it) “into the wilderness”, the benchmark isn’t worth running. A program that does that isn’t a compressor, it’s a database with stable keys. A Library of Congress Control Number (LCCN) uniquely identifies a published book, but it doesn’t contain a compressed copy of that book’s text.
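To make the "database with stable keys" point concrete, here is a rough Python sketch. Everything in it is made up for illustration: `CORPUS`, `key_compress`, `key_decompress`, and `total_cost` are hypothetical names, and the corpus is two toy strings rather than any real benchmark. The idea is just that the lookup scheme only looks good if nobody counts the data shipped alongside the decoder.

```python
import zlib

# Hypothetical stand-in for a benchmark corpus (toy data, not a real benchmark).
CORPUS = [
    b"the quick brown fox jumps over the lazy dog. " * 20,
    b"pack my box with five dozen liquor jugs. " * 20,
]

def key_compress(doc: bytes) -> bytes:
    """'Compress' by emitting the document's index in the bundled corpus."""
    # list.index raises ValueError for any input that was not bundled.
    return bytes([CORPUS.index(doc)])

def key_decompress(blob: bytes) -> bytes:
    """Look the document up again; nothing is actually decoded."""
    return CORPUS[blob[0]]

def total_cost(output_size: int, bundled_size: int) -> int:
    """Output size plus the size of any data shipped with the decoder."""
    return output_size + bundled_size

doc = CORPUS[0]
bundled = sum(len(d) for d in CORPUS)

key_out = key_compress(doc)
zlib_out = zlib.compress(doc)

# Both schemes round-trip the document.
assert key_decompress(key_out) == doc
assert zlib.decompress(zlib_out) == doc

key_total = total_cost(len(key_out), bundled)  # one byte of output plus the whole corpus
zlib_total = total_cost(len(zlib_out), 0)      # zlib ships no corpus-specific data

print("original:", len(doc))
print("key scheme:", len(key_out), "output,", key_total, "total")
print("zlib:", len(zlib_out), "output,", zlib_total, "total")
```

The key scheme emits a single byte, but once the bundled corpus is counted its total is larger than the original document, and it fails outright on any input it has not already seen.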