Comment by goodside

6 years ago

Counted by whom? What benchmark follows the model you’re describing? Does any real-world compressor use dictionaries anywhere near this big?

If you can bring the complete benchmark corpus (or substantial subsets of it) “into the wilderness”, the benchmark isn’t worth running. What you’re describing isn’t a compressor, it’s a database with stable keys. A Library of Congress Control Number (LCCN) uniquely identifies a published book, but it doesn’t contain a compressed copy of that book’s text.
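To make the distinction concrete, here is a rough sketch in Python. Everything in it is hypothetical: `CORPUS` is a toy stand-in for a benchmark corpus, and `lookup_encode`, `lookup_decode`, and `shipped_size` are made-up names, not part of any real compressor or benchmark. It only illustrates that a key lookup produces tiny output while the decoder has to carry the whole corpus, and that it fails outright on text it has never seen.

```python
import zlib

# Hypothetical stand-in for a benchmark corpus shipped alongside the "compressor".
CORPUS = [
    b"It was the best of times, it was the worst of times. " * 20,
    b"Call me Ishmael. Some years ago, never mind how long precisely. " * 20,
]


def lookup_encode(text: bytes) -> bytes:
    """Return a one-byte key for text that is already in the shipped corpus."""
    # list.index raises ValueError for any input that is not in CORPUS.
    return bytes([CORPUS.index(text)])


def lookup_decode(key: bytes) -> bytes:
    """Recover the text by looking the key up in the shipped corpus."""
    return CORPUS[key[0]]


def shipped_size() -> int:
    """Bytes the decoder must carry for the keys to mean anything."""
    return sum(len(t) for t in CORPUS)


if __name__ == "__main__":
    text = CORPUS[1]
    key = lookup_encode(text)
    print("key bytes:", len(key))
    print("corpus bytes carried by the decoder:", shipped_size())
    print("zlib bytes, no shipped corpus:", len(zlib.compress(text)))
```

The one-byte key looks like an unbeatable ratio only if the corpus the decoder carries is left out of the count, whereas `zlib` can round-trip input it has never seen without shipping that input in advance.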
