Comment by goodside
6 years ago
In theory, yes, but a constant 5 GB penalty is enormous in practice — orders of magnitude bigger than anything used in the real world. Brotli’s static dictionary is only 122 KB, and covers many natural and programming languages beyond just English.
> a constant 5 GB penalty is enormous in practice
> orders of magnitude bigger than anything used in the real world
And this is a tiny personal mailserver. There are loads of applications where a 5 GB penalty is well below the amount of text you're looking at (Wikipedia springs to mind, since its text is in the same kind of size range).
Obviously bodies of text bigger than 5 GB exist. I was talking about static compressor dictionaries, which are tiny. Hence mentioning Brotli's 122 KB dictionary. Static dictionaries are an optimization to improve the compression of very small text files. They aren't useful for compressing large files, because once you have lots of data you can build a more efficient dictionary at compression time and include it in the compressed stream.
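To make the small-vs-large point concrete, here is a rough sketch using zlib's preset-dictionary support (`zdict`) as a stand-in for Brotli's built-in dictionary. The dictionary contents, the sample message, and the helper names are all made up for illustration; the only claim is the general trend, not any particular numbers.

```python
import zlib

# Hypothetical shared dictionary: boilerplate we expect in many small messages.
SHARED_DICT = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Subject: Re: \r\nFrom: \r\nTo: \r\n"
)

def compressed_size(data: bytes, zdict: bytes = b"") -> int:
    """Size of the zlib stream for data, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return len(comp.compress(data) + comp.flush())

def relative_saving(data: bytes) -> float:
    """Fraction of compressed bytes saved by priming with SHARED_DICT."""
    plain = compressed_size(data)
    primed = compressed_size(data, SHARED_DICT)
    return (plain - primed) / plain

# One small (made-up) message, and a large input built from 2000 variations of it.
small = (
    b"From: alice@example.com\r\nTo: bob@example.com\r\n"
    b"Subject: Re: lunch\r\nContent-Type: text/plain; charset=utf-8\r\n"
)
large = b"".join(small.replace(b"lunch", b"thread %d" % i) for i in range(2000))

print(f"small input: {relative_saving(small):.1%} saved by the dictionary")
print(f"large input: {relative_saving(large):.1%} saved by the dictionary")

# The decoder must be given the same dictionary to read a primed stream.
comp = zlib.compressobj(level=9, zdict=SHARED_DICT)
blob = comp.compress(small) + comp.flush()
restored = zlib.decompressobj(zdict=SHARED_DICT).decompress(blob)
assert restored == small
```

On the small input the dictionary should account for a noticeable share of the output. On the large input the compressor learns the same strings from the data itself after the first few messages, so the head start is spread over far more bytes and the relative saving should shrink toward zero.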
Not to mention the hardware inefficiencies of a 5 GB dictionary on naive hardware. Poor caches. :(