Comment by chasil
2 days ago
The normal answer that I have heard to the performance problems in the conversion from scp to sftp is to use rsync.
The design of sftp is such that it cannot exploit "TCP sliding windows" to maximize bandwidth on high-latency connections. Thus, the migration from scp to sftp has involved a performance loss, which is well-known.
https://daniel.haxx.se/blog/2010/12/08/making-sftp-transfers...
The rsync question is not a workable answer, as OpenBSD has reimplemented the rsync protocol in a new codebase:
An attempt to combine the BSD-licensed rsync with OpenSSH would likely see it stripped out of GPL-focused implementations, where the original GPL release has long standing.
It would be more straightforward to design a new SFTP implementation that implements sliding windows.
I understand (but have not measured) that forcibly reverting to the original scp protocol will also raise performance in high-latency conditions. This does introduce an attack surface, should not be the default transfer tool, and demands thoughtful care.
I included LFTP using mirror+sftp in my example as it is the secure way to give less than trusted people access to files and one can work around the lack of sliding windows by spawning as many TCP flows as one wishes with LFTP. I would love to see SFTP evolve to use sliding windows but for now using it in the data-center or over WAN accelerated links is still fast.
Rsync is great when moving files between trusted systems that one has a shell on but the downside is that rsync can not split up files into multiple streams so there is still a limit based on source+dest buffer+rtt and one has to provide people a shell or add some clunky way to prevent a shell by using wrappers unless using native rsync port 873 which is not encrypted. Some people break up jobs on the client side and spawn multiple rsync jobs in the background. It appears that openrsync is still very much work in progress.
SCP is being or has been deprecated but the binaries still exist for now. People will have to hold onto old binaries and should probably static compile them as the linked libraries will likely go away at some point.
The scp program switched to calling sftp as the server in OpenSSH version 8.9, and notably Windows is now running 9.5, so large segments of scp users are now invoking sftp behind the scenes.
If you want to use the historic scp server instead, a command line option is provided to allow this:
"In case of incompatibility, the scp(1) client may be instructed to use the legacy scp/rcp using the -O flag."
https://www.openssh.org/releasenotes.html
The old scp behavior hasn't been removed, but you need to specifically request it. It is not the default.
It would seem to me that an alternate invocation for file transfer could be tested against sftp in high latency situations:
That would be slightly faster than tar, which adds some overhead. Using tar on both sides would allow transfers of special files, soft links, and retain hard links, which neither scp nor sftp will do.
Windows has also recently added a tar command.
Keep in mind that SCP/SSH might be faster in some cases than SFTP but in both cases it is still limited to a 2MB application layer receive window which is drastically undersized in a lot of situations. It doesn't matter what the TCP window is set to because the OpenSSH window overrides that value. Basically, if your bandwidth delay product is more than 2MB (e.g. 1gbps @ 17ms RTT) you're going to be application limited by OpenSSH. HPN-SSH gets most of the performance benefit by normalizing the application layer receive window to the TCP receive window (up to 128MB). In some cases you'll see 100X throughput improvement on well tuned hosts on a high delay path.
If your BDP is less than 2MB you still might get some benefit if you are CPU limited and use the parallel ciphers. However, the fastest cipher is AES-GCM and we haven't parallelized that as of yet (that's next on the list).
2 replies →
Rsync commonly uses SSH as the transport layer so it won't necessarily be any faster than SFTP unless you are using the rsync daemon (usually on port 873). However, the rsync daemon won't provide any encryption and I can't suggest using it unless it's on a private network.
> The rsync question is not a workable answer, as OpenBSD has reimplemented the rsync protocol in a new codebase
I thought openrsync existed solely because of rpki. Even OpenBSD devs recommend using the real version from ports.