
Comment by softfalcon

16 hours ago

> It almost always indicates some bottleneck in the application or TCP tuning.

Yeah, this has been my experience with low-overhead streams as well.

Interestingly, this "open more streams to send more data" pattern seems ubiquitous in file transfer tooling.

Recent examples that come to mind are Backblaze's B2 CLI and, from taking a peek with Wireshark, Amazon's SDK for S3 uploads. (What do they know that we don't seem to think we know?)

They all seem to be doing this, which is maybe odd, because when I analyse what Plex or Netflix does, it's not the same: they do what you're suggesting and tune the application plus the TCP/UDP stack. Though that could be due to their 1-to-1 streaming use case.

There is overhead somewhere and they're trying to get past it via semi-brute-force methods (in my opinion).

I wonder if there is a serialization or loss-handling problem that we're glossing over here.

Memory and CPU are cheap (up to a point), so why not just copy/paste TCP streams? It fits neatly into multi-processing/threading as well.
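As a rough illustration of that "just open more streams" idea, here is a minimal Python sketch that splits a file into byte ranges and pushes each range over its own TCP connection from a thread pool. The host, port, and the tiny offset/length header are made up; a real receiver would have to understand them and reassemble the slices.

```python
# Minimal sketch of the "just open more TCP streams" pattern: split a file
# into byte ranges and push each range over its own connection in parallel.
# HOST, PORT and the offset/length framing are hypothetical; a real receiver
# would need to understand them and reassemble the slices.
import os
import socket
import struct
from concurrent.futures import ThreadPoolExecutor

HOST, PORT = "backup.example.internal", 9000   # assumed endpoint
STREAMS = 8                                    # one TCP connection per worker

def send_slice(path: str, offset: int, length: int) -> int:
    """Open a dedicated TCP connection and send one byte range of the file."""
    with socket.create_connection((HOST, PORT)) as sock, open(path, "rb") as f:
        sock.sendall(struct.pack("!QQ", offset, length))  # ad-hoc slice header
        f.seek(offset)
        remaining = length
        while remaining:
            chunk = f.read(min(1 << 20, remaining))  # 1 MiB reads
            if not chunk:
                break
            sock.sendall(chunk)
            remaining -= len(chunk)
    return length - remaining

def parallel_send(path: str) -> None:
    size = os.path.getsize(path)
    per_stream = -(-size // STREAMS)  # ceiling division
    ranges = [(i * per_stream, min(per_stream, size - i * per_stream))
              for i in range(STREAMS) if i * per_stream < size]
    with ThreadPoolExecutor(max_workers=STREAMS) as pool:
        total = sum(pool.map(lambda r: send_slice(path, *r), ranges))
    print(f"sent {total} bytes over {len(ranges)} TCP streams")

if __name__ == "__main__":
    parallel_send("dump.tar")
```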

When we were doing 100 TB backups of storage servers, we had a wrapper that ran multiple rsyncs over the file system; that got throughput up to about 20 gigabits a second over the LAN.
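The shape of such a wrapper is simple. A hedged Python sketch, assuming you partition by top-level directory and run one rsync per partition in parallel (paths and worker count are placeholders, and real wrappers usually balance partitions by size rather than by directory count):

```python
# Rough sketch of a "run multiple rsyncs" wrapper: partition the tree by
# top-level directory and run one rsync per partition in parallel. SRC,
# DEST and WORKERS are placeholders; real wrappers usually balance the
# partitions by size rather than by directory count.
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

SRC = Path("/srv/data")              # hypothetical source tree
DEST = "backup-host:/srv/data"       # hypothetical rsync destination
WORKERS = 8                          # parallel rsync processes

def sync_subtree(subdir: Path) -> int:
    """Run one rsync process covering a single top-level directory."""
    cmd = ["rsync", "-a", str(subdir), DEST + "/"]
    return subprocess.run(cmd, check=False).returncode

if __name__ == "__main__":
    subdirs = [p for p in SRC.iterdir() if p.is_dir()]
    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        codes = list(pool.map(sync_subtree, subdirs))
    print(f"{codes.count(0)}/{len(codes)} rsync runs succeeded")
```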

That is a different problem. For S3-esque transfers you might very well be limited by how much the target will accept on a single connection (X MB/s and not more), so starting parallel streams will make it faster.

I used B2 as the third leg for our backups and pretty much had to give rclone more connections at once, because the defaults were nowhere close to saturating bandwidth.
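For anyone hitting the same wall, the main knob is rclone's --transfers flag, which sets how many transfers run in parallel (per-backend concurrency options exist as well, so check what your rclone version supports). A minimal sketch with made-up paths and remote name:

```python
# Hedged illustration of "give rclone more connections": raise the number
# of parallel transfers above the default when copying to a B2 remote.
# The remote name and paths are made up; --transfers is rclone's flag for
# how many file transfers run at once.
import subprocess

def rclone_copy(src: str, remote_path: str, transfers: int = 16) -> None:
    """Copy src to the remote with more parallel transfers than the default."""
    subprocess.run(
        ["rclone", "copy", src, remote_path, "--transfers", str(transfers)],
        check=True,
    )

if __name__ == "__main__":
    rclone_copy("/srv/backups", "b2-remote:my-bucket/backups")
```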

Tuning on Linux requires root and is system-wide, and I don't think BBR is even available on other systems. You also need to tune the buffer sizes on both ends. Using multiple streams is just less of a hassle for client users, and it can fool some traffic-shaping tools too. Internal use is a different story.
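A small illustration of why the system-wide part matters: an unprivileged process can ask for bigger per-socket buffers, but the kernel clamps the request to the net.core.rmem_max / net.core.wmem_max caps, which only root can raise, and the receiving end needs the same treatment. A minimal Python sketch, with arbitrary values:

```python
# Why the system-wide part matters: an unprivileged process can request
# bigger per-socket buffers, but the kernel clamps the request to the
# net.core.rmem_max / net.core.wmem_max caps, which only root can raise,
# and the peer's side needs the same treatment. Values are arbitrary.
import socket

WANTED = 16 * 1024 * 1024  # ask for 16 MiB buffers

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, WANTED)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, WANTED)

# Linux returns roughly double the value it actually granted (bookkeeping
# overhead); if this is far below WANTED, the system-wide caps are the limit.
print("granted send buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("granted recv buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
sock.close()
```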

Not sure about B2, but it makes total sense that the AWS S3 SDK doesn't assume people will do any tuning, because in my experience hardly anyone does.

  • I’ve found that with aws s3 it’s always been painful to get any good speed out of it unless you’re moving massive files.

    Its baseline tuning seems to just assume large files, does no auto-scaling, and it’s mostly single-threaded.

    Then even with tuning it’s still painfully slow, again seemingly limited by its CPU processing, mostly on a single thread, which is highly annoying.

    Especially when you’re running it on a machine with lots of cores, fast storage, and a big internet connection.

    It just feels like there is a large amount of untapped potential in the machines… (a sketch of the tuning knobs that do exist follows below.)

    • It’s almost certainly also tuned to prevent excessive or “spiky” traffic to their service.
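On the tuning side, the knobs that do exist are exposed programmatically through boto3's TransferConfig (the aws CLI reads analogous s3 settings from ~/.aws/config). A hedged sketch, with placeholder bucket, key, and file path, and deliberately larger-than-default values that are illustrative rather than recommendations:

```python
# Hedged sketch of the tuning knobs that do exist: boto3's TransferConfig
# controls the multipart threshold, part size, and upload concurrency
# (the aws CLI reads analogous s3 settings from ~/.aws/config). Bucket,
# key and file path are placeholders; values are larger-than-default
# examples, not recommendations.
import boto3
from boto3.s3.transfer import TransferConfig

config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,  # switch to multipart above 64 MiB
    multipart_chunksize=64 * 1024 * 1024,  # 64 MiB parts
    max_concurrency=32,                    # parallel part uploads (threads)
    use_threads=True,
)

s3 = boto3.client("s3")
s3.upload_file("dump.tar", "my-bucket", "backups/dump.tar", Config=config)
```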