Comment by DannyBee

2 years ago

Hashing is literally compression (and is considered so); it's just not easily decompressible.

It's also often lossy, though it depends on your input space.

For example, if you accept all text of any length, then it's lossy for sure: there are more possible inputs than fixed-size hash values, so some inputs have to collide. But if you only accept something like "the text of any book", then it's easy to make it non-lossy.
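To make the "lossy for sure" case concrete, here is a minimal sketch in Python. It assumes a deliberately tiny hash (SHA-256 cut down to one byte, so only 256 possible outputs); `tiny_hash` and `find_collision` are made-up names for illustration. Feed it 257 distinct inputs and two of them are guaranteed to land on the same value, so you can't tell them apart from the hash alone.

```python
import hashlib


def tiny_hash(text: str) -> bytes:
    """Illustrative 1-byte hash: SHA-256 truncated to its first byte."""
    return hashlib.sha256(text.encode("utf-8")).digest()[:1]


def find_collision(n_inputs: int = 257):
    """Return two distinct inputs with the same tiny_hash, or None.

    With 257 distinct inputs and only 256 possible outputs, the
    pigeonhole principle guarantees a collision is found.
    """
    seen = {}
    for i in range(n_inputs):
        text = f"text-{i}"
        h = tiny_hash(text)
        if h in seen:
            return seen[h], text
        seen[h] = text
    return None


pair = find_collision()
```

A real hash with a 32-byte output has the same problem, just with far more inputs needed before you hit it.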

There are only about 140-150 million books in the world, at a rough estimate, so you could losslessly compress any existing book to a few bytes: an index into a set that size needs only 28 bits, which fits in 4 bytes. Even if you multiply that out to all variants and translations, the result would still likely fit in fewer than 100 bytes. But you still have to store the table of books somewhere at least once, and it wouldn't be able to compress new books :P
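A rough sketch of that idea in Python, under the assumption that the "hash" is just a position in a shared table of known books. `bytes_needed` and `BookTable` are hypothetical names, not from any library. The table is where all the real data lives, and anything not in it can't be compressed at all.

```python
def bytes_needed(n_items: int) -> int:
    """Smallest whole number of bytes that can index n_items distinct items."""
    bits = max(1, (n_items - 1).bit_length())
    return (bits + 7) // 8


class BookTable:
    """Lossless 'compression' of known books to a fixed-width table index."""

    def __init__(self, books):
        self.books = list(books)
        # Reverse lookup: full text -> position in the table.
        self.index = {text: i for i, text in enumerate(self.books)}
        self.width = bytes_needed(len(self.books))

    def compress(self, text: str) -> bytes:
        # Raises KeyError for a book that is not in the table.
        return self.index[text].to_bytes(self.width, "big")

    def decompress(self, code: bytes) -> str:
        return self.books[int.from_bytes(code, "big")]


table = BookTable(["Moby-Dick full text", "Dracula full text", "Emma full text"])
code = table.compress("Dracula full text")
restored = table.decompress(code)
```

For the estimate above, `bytes_needed(150_000_000)` comes out to 4.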