
Comment by Philip-J-Fry

3 years ago

In my opinion, it's correct for Nagle's algorithm to be disabled by default.

I think Nagle's algorithm does more harm than good if you're unaware of it. I've seen people writing C# applications and wondering why stuff is taking 200ms. Some people don't even realise it's Nagle's algorithm (edit: interacting with Delayed ACKs) and think it's network issues or a performance problem they've introduced.

I'd imagine most Go software is deployed in datacentres where the network is high quality and it doesn't really matter too much. Fast data transfer is probably preferred. I think Nagle's algorithm should be an optimisation you can optionally enable (which you can) to more efficiently use the network at the expense of latency. Being more "raw" seems like the sensible default to me.

The basic problem, as I've written before[1][2], is that, after I put in Nagle's algorithm, Berkeley put in delayed ACKs. Delayed ACKs delay sending an empty ACK packet for a short, fixed period based on human typing speed, maybe 100ms. This was a hack Berkeley put in to handle large numbers of dumb terminals going into time-sharing computers via terminal-to-Ethernet concentrators. Without delayed ACKs, each keystroke sent a datagram with one payload byte, and got back a datagram with no payload, just an ACK, followed shortly thereafter by a datagram with one echoed character. So they got a 30% load reduction for their TELNET application.

Both of those algorithms should never be on at the same time. But they usually are.

Linux has a socket option, TCP_QUICKACK, to turn off delayed ACKs. But it's very strange. The documentation is kind of vague, but apparently you have to re-enable it regularly.[3]

Sigh.

[1] https://stackoverflow.com/questions/46587168/when-during-the...
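
For anyone who wants to try this from Go: re-arming TCP_QUICKACK on Linux looks roughly like the sketch below. This is only a sketch, assuming the golang.org/x/sys/unix package; the helper name is made up, and as noted above the option has to be set again periodically because the kernel clears it on its own.

    package tcpquickack

    import (
        "net"

        "golang.org/x/sys/unix"
    )

    // setQuickAck re-arms TCP_QUICKACK on a Linux TCP connection. The kernel
    // clears the flag again by itself, so callers who want immediate ACKs
    // typically have to call this around every read that matters.
    func setQuickAck(conn *net.TCPConn) error {
        raw, err := conn.SyscallConn()
        if err != nil {
            return err
        }
        var optErr error
        if err := raw.Control(func(fd uintptr) {
            optErr = unix.SetsockoptInt(int(fd), unix.IPPROTO_TCP, unix.TCP_QUICKACK, 1)
        }); err != nil {
            return err
        }
        return optErr
    }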

  • Gotta love HN. The man himself shows up to explain.

  • > The documentation is kind of vague, but apparently you have to re-enable it regularly.[3]

    This is correct. And in the end it means that setting the socket option is more a way of sending an explicit ACK from userspace than a real, persistent setting.

    It's not great for common use-cases, because making userspace care about ACKs will obviously degrade efficiency (more syscalls).

    However it can make sense for some use-cases. E.g. I saw the s2n TLS library using QUICKACK to avoid the TLS handshake getting stuck [1]. It may also be worth setting in some specific RPC scenarios where the server might not immediately send a response on receiving the request, and where the client could send additional frames (e.g. gRPC client-side streaming, or pipelined HTTP requests if the server really processes those in parallel and doesn't just let them sit in socket buffers).

    [1] https://github.com/aws/s2n-tls/blob/46c47a71e637cabc312ce843...

  • Can any kernel engineer reading this explain why TCP_QUICKACK isn't enabled by default? Maybe it's time to turn it on by default, if delayed ACKs were just a workaround for old terminals.

    • Enabling it leads to more ACK packets being sent, which means lower TCP efficiency (the stack spends time processing ACK packets) and more link capacity spent on overhead (those packets need space somewhere too).

      My thought is that the behavior is probably correct by default, since a receiver without knowledge of the application protocol cannot know whether follow-up data will arrive immediately, and therefore cannot decide whether it should send an ACK right away or wait for more data. It could wait for a signal from userspace to send that ACK - which is exactly what QUICKACK does - but that comes with the drawback of needing an extra syscall per read.

      On the sender side the problem seems solvable more efficiently. If one aggregates data in the application and just sends everything at once using an explicit flush signal (either using CORKing APIs or enabling TCP_NODELAY), no extra syscall is required while minimal latency can be maintained (a corking sketch follows after this thread).

      However I think it's a fair question whether the delayed ACK periods are still the best choice for the modern internet, or whether much smaller delays (e.g. 5ms, or something on the order of a fraction of the RTT) would be more helpful.

  • Thanks for this reply. What I find especially annoying is that the TCP client and server start with a synchronization round-trip that is supposed to be used to negotiate options, and that isn't happening here! Why can't the client and the server agree on a sensible set of options (no delayed ACKs if the peer is using Nagle's algorithm)??

  • TCP_QUICKACK is mostly used to send initial data along with the first ACK when establishing a connection, or to make sure the FIN is merged with the last segment.

  • How is it possible that delayed ACKs and Nagle's algorithm are both defaults anywhere? Isn't this a matter of choosing one or the other?

  • Did the move from line-oriented input to character-oriented input also occur around then?

    I remember as a student, vi was installed and we all went from using ed to vi.

    There was much gnashing and wailing from the admins of the VAX.

    • 1984 would have been largely character-oriented if desired -- you already had desktop PCs with joystick and mouse too. The problem was the original party-line Ethernet with large numbers of telnet clients, or some other [nonstop, nonburst] byte-oriented protocol or serial hardware concentrator, which was a universal situation at educational institutions of the mid-to-late eighties. The Berkeley hack referred to above likely boosted the number of clients you could run on one Ethernet subnet with acceptable responsiveness.
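
A minimal sketch of the sender-side corking mentioned in the reply above, assuming Linux and the golang.org/x/sys/unix package (function names are illustrative). While the socket is corked the kernel holds back partial segments; removing the cork flushes whatever has accumulated:

    package tcpcork

    import (
        "net"

        "golang.org/x/sys/unix"
    )

    // setCork turns TCP_CORK on or off for a Linux TCP connection.
    func setCork(conn *net.TCPConn, on bool) error {
        raw, err := conn.SyscallConn()
        if err != nil {
            return err
        }
        v := 0
        if on {
            v = 1
        }
        var optErr error
        if err := raw.Control(func(fd uintptr) {
            optErr = unix.SetsockoptInt(int(fd), unix.IPPROTO_TCP, unix.TCP_CORK, v)
        }); err != nil {
            return err
        }
        return optErr
    }

    // writeMessage corks the socket, writes the pieces of one logical message,
    // and uncorks to flush everything out as full-sized segments.
    func writeMessage(conn *net.TCPConn, header, body []byte) error {
        if err := setCork(conn, true); err != nil {
            return err
        }
        if _, err := conn.Write(header); err != nil {
            return err
        }
        if _, err := conn.Write(body); err != nil {
            return err
        }
        return setCork(conn, false) // uncork: flush the aggregated data now
    }

Enabling TCP_NODELAY and doing one big write per message achieves much the same effect without the extra setsockopt calls.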

From the bottom of the article:

> Most people turn to TCP_NODELAY because of the “200ms” latency you might incur on a connection. Fun fact, this doesn’t come from Nagle’s algorithm, but from Delayed ACKs or Corking. Yet people turn off Nagle’s algorithm … :sigh:

  • Yeah, but the interaction between Nagle's Algorithm and Delayed ACKs is what causes the 200ms.

    Servers tend to enable Nagle's algorithm by default. Clients tend to enable Delayed ACKs by default, and then you get this horrible interaction all because both sides are trying to be more efficient but end up stalling each other.

    I think Go's behavior is the right default because you can't control every server. But if Nagle's were off by default on servers, then we wouldn't need to disable Delayed ACKs on clients.

    • There's a very good reason for clients to delay ACKs: those ACKs cost data, and clients tend to have much higher download bandwidth than upload bandwidth. Really, clients should probably be delaying ACKs and nagling packets, while servers should probably be doing neither.


    • "Be conservative in what you send and liberal in what you accept"

      I would cite Postel's Law: Nagle's algorithm is the "conservative send" side. An ACK is a signal of acceptance, and should be issued more liberally (even though it's also sent, I guess).

Agreed. The post could just be titled 'Go enables TCP_NODELAY by default'; it barely even needs a body. It's documented: https://pkg.go.dev/net#TCPConn.SetNoDelay

Knowing why would be interesting, I guess. But you should be buffering writes anyway in most cases, and if you refuse to do that, just turn TCP_NODELAY back off on the socket. This is on the code author.
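
For example, here is a rough sketch of both options, assuming a plain *net.TCPConn (the helper names are made up): buffer small writes in userspace so they reach the socket as one write, or use the documented SetNoDelay escape hatch to turn Nagle's algorithm back on.

    package nodelay

    import (
        "bufio"
        "net"
    )

    // bufferedWriter coalesces small application writes in userspace; only
    // Flush pushes them to the socket, so Go's default of TCP_NODELAY being
    // enabled does no harm.
    func bufferedWriter(conn *net.TCPConn) *bufio.Writer {
        return bufio.NewWriter(conn)
    }

    // reenableNagle turns Nagle's algorithm back on for callers who would
    // rather let the kernel coalesce writes.
    func reenableNagle(conn *net.TCPConn) error {
        return conn.SetNoDelay(false)
    }

Typical use of the buffered variant is to write the header and body to the *bufio.Writer and call Flush once per message, so at most one small segment goes out per message.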

> I'd imagine most Go software is deployed in datacentres where the network is high quality

The problem is that those datacenters are plugged into the Internet, where the network is not always high quality. TFA mentions the Caddy webserver - this is "datacenter" software designed to talk to diverse clients all over the internet. The stdlib should not tamper with the OS defaults unless the OS defaults are pathological.

  • That doesn't make much sense. There are all sorts of socket and file descriptor parameters with defaults that are situational; NDELAY is one of them, as is buffer size, nonblockingness, address reuse, &c. Maybe disabling Nagle is a bad default, maybe it isn't, but the appeal to "OS defaults" is a red herring.

  • Also, for small packets, disabling consolidation adds LOTS of packet overhead. You're not sending 1 million * 50 bytes of data, you're sending 1 million * (50 bytes of data + about 80 bytes of TCP+Ethernet headers); rough numbers below.

    Disabling Nagle makes sense for tiny request/replies (like RPC calls), but it's counterproductive for bulk transfers.

    I'm not the only one who doesn't like the thought of a standard library quietly changing standard system behaviour ... so now I have to know the standard routines and their behaviour AND I have to know which platforms/libraries silently reverse things :(
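
    To put rough numbers on the overhead point above (header sizes are approximate and purely illustrative):

        package main

        import "fmt"

        func main() {
            const (
                messages  = 1_000_000
                payload   = 50 // application bytes per message
                perPacket = 80 // approx. Ethernet + IP + TCP framing per packet
            )
            fmt.Println("payload:", messages*payload/1_000_000, "MB") // ~50 MB
            fmt.Println("on the wire, one packet per message:",
                messages*(payload+perPacket)/1_000_000, "MB") // ~130 MB
        }

    That's roughly 2.6x the payload size on the wire before any retransmissions.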