
Comment by withinboredom

3 years ago

I prefer reliability over latency, always. The world won’t fall apart in 200ms, let alone 40ms. If you’re doing something where latency does matter (like stocks) then you probably shouldn’t be using TCP, honestly (avoid the handshake!)

When it comes to code, readability and maintainability are more important. If your code reads chunks of a file and sends each chunk as a packet, you won’t know the MTU, or changes to the MTU, along the path. Send your chunk and let Nagle optimize it.

Further, the principle of least surprise always applies. The OS default is for Nagle to be enabled. For a language to choose a different default (without providing a reason), and one that is actively harmful in poor network conditions at that, was truly surprising.
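For concreteness, Go is one runtime that enables TCP_NODELAY by default; here's a minimal sketch of switching a single connection back to the OS default (the address is just a placeholder):

```go
package main

import (
	"log"
	"net"
)

func main() {
	// Placeholder address, purely for illustration.
	conn, err := net.Dial("tcp", "example.com:443")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Go turns TCP_NODELAY on by default for TCP connections;
	// SetNoDelay(false) restores the OS default behaviour (Nagle enabled).
	if tcp, ok := conn.(*net.TCPConn); ok {
		if err := tcp.SetNoDelay(false); err != nil {
			log.Fatal(err)
		}
	}
}
```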

TCP is always reliable; the choice of this algorithm will never impact that - it will only impact performance (bandwidth/latency) and efficiency.

Enabling Nagle by default will lead to elevated latencies with some protocols that don't require the peer to send a response (and thereby a piggybacked ACK) after each packet. Even a "modern" TLS 1.3 0-RTT handshake might fall into that category. This is a performance degradation.

The scenario described in the blog post, where too many small packets (because nothing aggregates them) cause elevated packet loss, is a different performance degradation, and nothing else.

Both of those can be fixed - the former only by enabling TCP_NODELAY (since the client won't have control over servers), the latter by either keeping TCP_NODELAY disabled *or* by aggregating data in userspace (e.g. using a BufferedWriter, which a lot of TLS stacks might integrate by default).
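To illustrate the userspace-aggregation option, a minimal Go sketch using bufio.Writer as the BufferedWriter (endpoint and payload are placeholders):

```go
package main

import (
	"bufio"
	"log"
	"net"
)

func main() {
	// Placeholder endpoint, purely for illustration.
	conn, err := net.Dial("tcp", "example.com:8080")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Coalesce small writes in userspace instead of relying on Nagle:
	// the buffered writer only hits the socket once its 16 KiB buffer
	// fills up or Flush is called.
	w := bufio.NewWriterSize(conn, 16*1024)
	for _, part := range []string{"header", "body", "trailer"} {
		if _, err := w.WriteString(part); err != nil {
			log.Fatal(err)
		}
	}
	if err := w.Flush(); err != nil { // one write() for all three parts
		log.Fatal(err)
	}
}
```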

> The world won’t fall apart in 200ms, let alone 40ms.

You might be underestimating the latency sensitivity of the modern internet. Websites use CDNs to get typical latencies into the 20ms range. If this suddenly increases to 40ms, the internet experience of a lot of people might get twice as bad as it is at the moment. 200ms might directly push the average latency into what is currently P99.9.

And it would get even worse for intra-datacenter use cases, where the average is in the 1ms range - and where accumulated latencies would still end up being noticeable to users (the latency of any RPC call is the accumulated latency of its upstream calls).

> If your code is reading chunks of a file then sending it to a packet, you won’t know the MTU or changes to the MTU along the path

Sure - you don't have to. As mentioned, you would just read into an intermediate application buffer of a reasonable size (definitely bigger than 16 kB, or 10 MTUs) and let the OS deal with it. A loop along the lines of `n = read(socket, buffer); write(socket, buffer[0..n])` will not run into the described issue if the buffer is reasonably sized, and it will be a lot more CPU efficient than doing tiny syscalls and expecting all aggregation to happen in TCP send buffers.
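A rough Go version of that loop, with a 64 KiB buffer standing in for "reasonably sized" (the stdin/stdout wiring is just for the demo):

```go
package main

import (
	"io"
	"log"
	"os"
)

// relay copies src to dst through one reasonably sized buffer, so each
// write hands the kernel tens of KiB rather than MTU-sized crumbs.
func relay(dst io.Writer, src io.Reader) error {
	buf := make([]byte, 64*1024) // 64 KiB, comfortably more than 10 MTUs
	for {
		n, err := src.Read(buf)
		if n > 0 {
			if _, werr := dst.Write(buf[:n]); werr != nil {
				return werr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	// Demo wiring: stdin -> stdout; swap in a file and a net.Conn in practice.
	if err := relay(os.Stdout, os.Stdin); err != nil {
		log.Fatal(err)
	}
}
```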

Much of the world is doing OK with TCP and TLS, but with session resumption and long-lived connections. Many links will be marked bad after 200 ms, and retries or new connections issued. Imagine you are doing 20k requests / second / CPU: a 200 ms stall is four thousand backed-up calls (20,000 × 0.2 s) for no reason, just randomness.

> I prefer reliability over latency, always.

I imagine all the engineers who serve millions/billions of requests per second disagree with adding 200ms to each request, especially since their datacenter networks are reliable.

> Send your chunk and let Nagle optimize it.

Or you could buffer yourself and save dozens/hundreds of expensive syscalls. If adding buffering makes your code unreadable, your code has bigger maintainability problems.
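One sketch of that in Go: batch the small chunks and hand them to the kernel together via net.Buffers, which the net package turns into a vectored (writev-style) write on platforms that support it, so it's one syscall instead of one per chunk. Endpoint and payload below are placeholders:

```go
package main

import (
	"log"
	"net"
)

func main() {
	// Placeholder endpoint, purely for illustration.
	conn, err := net.Dial("tcp", "example.com:8080")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Three logically separate chunks, submitted as one batch; where the
	// OS supports it this becomes a single vectored write instead of
	// three separate write syscalls.
	bufs := net.Buffers{
		[]byte("header"),
		[]byte("body"),
		[]byte("trailer"),
	}
	if _, err := bufs.WriteTo(conn); err != nil {
		log.Fatal(err)
	}
}
```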

  • I’ve done quite a bit of testing on my shitty network (plus a test bench using Docker and Pumba) in the last 24 hours; I’m not finished, so take the rest of this with a grain of salt. There will be a blog post about this in the near future… once I finish the analysis.

    Random connection resets are much more likely when disabling Nagle’s algorithm. As in 2-4x more likely, especially with larger payloads. Most devs just see “latency bad” without considering the other benefits of Nagle: you won’t send a new partial packet until the outstanding data is ACKed or you have a full packet. On poor networks, you always see terrible latency (even with Nagle disabled, 200-500ms is the norm), and with Nagle the throughput is a bit higher than without, even with proper buffering on the application side.