Comment by Filligree
8 years ago
One useful trick is that, for gzip, d(z(x+y)) = d(z(x) + z(y)), where z compresses, d decompresses, and + is concatenation.
So you don't need to compress the entire terabyte.
I'd expect that to give a lower compression ratio, though it may not matter given the additional followup gzips.
The compression finally finished after 3h (on an old MBP), "dd if=/dev/zero bs=1m count=1m | gzip | gzip | gzip" yields a bit under 10k (10082 bytes), and adding a 4th gzip yields a bit under 4k (4004 bytes). The 5th gzip starts increasing the size of the archive.
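For anyone who wants to see the effect of stacked gzip passes without waiting 3h, here is a rough small-scale sketch in Python. It uses 8 MiB of zeros instead of a terabyte and the standard library's gzip module instead of the command-line tool, so the byte counts will not match the ones quoted above. The helper name gzip_passes is made up for illustration.

```python
import gzip

def gzip_passes(data: bytes, passes: int) -> list[int]:
    """Compress data repeatedly and return the size after each pass."""
    sizes = []
    for _ in range(passes):
        data = gzip.compress(data)  # output of one pass feeds the next
        sizes.append(len(data))
    return sizes

# Assumption: 8 MiB of zeros stands in for the terabyte from /dev/zero.
zeros = bytes(8 * 1024 * 1024)
for n, size in enumerate(gzip_passes(zeros, 4), start=1):
    print(f"after gzip pass {n}: {size} bytes")
```

At this scale the later passes run out of redundancy much sooner than the 5th pass, since there is far less input to begin with.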
It does, though I once used that trick to create a file containing more "Hello, World" lines than there are atoms in the universe. By, hmm, quite a large factor. It probably isn't a serious concern.
It still fit on a floppy disk. :)
That's true for the content stream but not gzip files themselves, which do have a minimal header.
Which gunzip will overlook / handle correctly, so concatenating the compressed files does work.
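A minimal sketch of the concatenation trick, assuming Python's standard gzip module (whose gzip.decompress is documented to handle multi-member data) rather than the gunzip binary. One chunk is compressed once and the compressed bytes are simply repeated. The helper name repeat_member and the chunk size are hypothetical choices for illustration.

```python
import gzip

def repeat_member(chunk: bytes, repeats: int) -> bytes:
    """Compress chunk once, then repeat the compressed member."""
    member = gzip.compress(chunk)  # one complete gzip member, header included
    return member * repeats        # concatenated members form a valid gzip stream

chunk = b"Hello, World\n" * 1000
archive = repeat_member(chunk, 50)

# Decompressing the concatenation yields the concatenation of the inputs.
restored = gzip.decompress(archive)
print(len(restored), "bytes restored from", len(archive), "compressed bytes")
print("round trip ok:", restored == chunk * 50)
```

Each repeated member carries its own header and trailer, which is why this gives a somewhat worse ratio than compressing the whole input in one go.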