← Back to context

Comment by cantrecallmypwd

4 months ago

History lesson for the cheap seats in the back:

Compressed tars are terrible for random access because the compression occurs after the concatenation and so knows nothing about inner file metadata, but it's good for streaming and backups. Uncompressed tars are much better for random access. (Tar was a used as a backup mechanism to tape (tape archive).)

Zips are terrible for streaming because their metadata is stored at the end, but are better for 1-pass creation and on-disk random access. (Remember that zip files and programs were created in an era of multiple floppy disk-based backups.)

When fast tar enumeration is desired, at the cost of compatibility and compression potential, it might be worth compressing files and then taring them when and if zipping alone isn't achieving enough compression and/or decompression performance. FUSE compressed tar mounting gets to be really expensive with terabyte archives.