Comment by spockz
15 hours ago
I'm not sure why, but just like with scp, I've achieved significant speed-ups by tarring the directory first (optionally compressing it), transferring it, and then unpacking it on the other side. Maybe it's because the tar-and-send side and the receive-and-untar side end up running on different threads?
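For concreteness, the three-step version is roughly this (host and paths are placeholders):

    tar czf /tmp/dir.tar.gz ./dir                       # pack (and compress) locally
    scp /tmp/dir.tar.gz user@remote:/tmp/               # one big transfer
    ssh user@remote 'tar xzf /tmp/dir.tar.gz -C /dest'  # unpack on the far side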
One of my "goto" tools is copying files over a "tar pipe". This avoids the temporary tar file. Something like:
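    # srcdir, destdir, and user@remote are placeholders; add z to both
    # tar invocations (or splice in lz4) to compress in flight
    tar cf - ./srcdir | ssh user@remote 'tar xf - -C /destdir'

The local tar streams the archive to stdout, ssh carries it across, and the remote tar unpacks from stdin, so packing, network transfer, and unpacking all overlap with no temporary file.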
I've never verified this, but it feels like scp starts a new TCP connection per file. If that's the case, then scp-ing a tarred directory would be faster because you'd only go through TCP slow start once: https://www.rfc-editor.org/rfc/rfc5681#section-3.1
Also handy to note that tar can handle sparse files (via -S/--sparse in GNU tar), whereas scp can't.
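A sparse-aware version of the pipe above would look something like this (GNU tar; host and paths again placeholders):

    # -S detects holes when creating; GNU tar restores sparse members sparsely on extract
    tar cSf - ./vm-images | ssh user@remote 'tar xf - -C /backups'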
It's typically a disk-latency thing: just stat-ing the many files in a directory can add significant latency (especially on spinning HDDs), versus opening a single file (the tar) and read()-ing it into memory before writing to the network.
If copying a folder with many files is slower than tarring that folder and then moving the tar (not counting the untar), then disk latency is your bottleneck.
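A quick way to run that comparison yourself (paths and host are placeholders; time each step separately):

    time scp -r ./manyfiles user@remote:/dest/      # per-file copy
    time tar cf /tmp/manyfiles.tar ./manyfiles      # local tar only
    time scp /tmp/manyfiles.tar user@remote:/dest/  # single-file copy

If the second and third steps together beat the first, per-file overhead (stat/open latency, per-file round trips) is what's hurting you.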
Not useful very often, but fast and kind of cool: You can also just netcat the whole block device if you wanted a full filesystem copy anyway. Optionally zero all empty space first with a tool like zerofree (so it compresses well), and use on-the-fly compression/decompression with lz4 or lzo. Of course, none of the block devices should be mounted, though you could probably get away with a source that's mounted read-only.
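A sketch, assuming OpenBSD-style netcat flags (traditional nc wants -l -p PORT instead) and placeholder devices, host, and port:

    # receiver: listen, decompress, write the image onto the target device (as root)
    nc -l 9000 | lz4 -d > /dev/sdX

    # sender: read the device, compress on the fly, stream it across
    lz4 < /dev/sdY | nc receiver.example 9000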
dd is not a magic tool that can deal with block devices while others can't. You can just cp myLinuxInstallDisk.iso to /dev/myUsbDrive, too.
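i.e. for a straight image write these are equivalent (same example names as above):

    cp myLinuxInstallDisk.iso /dev/myUsbDrive
    # same result as the usual:
    dd if=myLinuxInstallDisk.iso of=/dev/myUsbDrive bs=4M status=progress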
Okay. In this case the whole operation is faster end to end, including the time it takes to tar and untar. Maybe those programs access the disk more efficiently than scp and rsync do?