Comment by metadaemon

2 years ago

I'd just love a protocol that has a built-in mechanism for realizing that the other side of the pipe has disconnected, for any reason.

That's possible in circuit switched networking with various types of supervision, but packet switched networking has taken over because it's much less expensive to implement.

Attempts to add connection monitoring usually make things worse. Say you need to reroute a cable, and one or both endpoints detect the disconnection and close user sockets: that's not great. What could have been a quick swap with a brief period of data loss and an otherwise minor interruption instead drops every established connection.

To re-word everyone else's comments - "Disconnected" is not well-defined in any network.

  • > To re-word everyone else's comments - "Disconnected" is not well-defined in any network.

    Parent said disconnected pipe, not network. It's sufficiently well-definable there.

    • I think it's a distinction without a difference in this case. You can't know if the reason your water stopped is because the water is shut off, the pipe broke, or it's just slow.

      When all you have to go on is "I stopped getting packets," the best you can do is give up after a bit. TCP keepalives do kinda suck and are full of interesting choices that don't seem to have passed the test of time. But they are there, and if you control both sides of the connection you can be sure they work.

That's really, really hard. For a full, guaranteed way to do this we'd need circuit switching (or circuit-switching emulation). That's expensive in packet networks: every middlebox would have to track every flow, which means a lot more RAM at every hop and probably a lot more processing power. If we go with circuit establishment instead, it's also expensive, and it breaks the whole "distributed, decentralized, self-healing network" property of the Internet.
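
Back-of-envelope on the RAM cost, with invented but plausible numbers:

    # All numbers here are illustrative, not measurements.
    flows = 10_000_000        # concurrent flows crossing one core router
    state_per_flow = 64       # bytes needed to track each flow
    print(flows * state_per_flow / 2**20)   # ~610 MiB of bookkeeping, per hop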

It's possible to do better than TCP these days, since bandwidth is much, much less constrained than it was when TCP was designed, but detecting a disconnected pipe by any means other than timeouts (which we already have) is still a hard problem.

Several of the "reliable UDP" protocols I have worked on in the past have had a heartbeat mechanism that is specifically for detecting this. If you haven't sent a packet down the wire in 10-100 milliseconds, you will send an extra packet just to say you're still there.

It's very useful to do this in intra-datacenter protocols.
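
A minimal sketch of that pattern in Python, assuming a plain datagram socket; the interval, timeout, port, and peer address are illustrative, not taken from any particular protocol:

    import socket
    import time

    HEARTBEAT_INTERVAL = 0.05   # 50 ms: send *something* at least this often
    PEER_TIMEOUT = 0.5          # 500 ms of silence => treat the peer as gone

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 9999))
    sock.settimeout(HEARTBEAT_INTERVAL)
    peer = ("10.0.0.2", 9999)   # hypothetical peer

    last_sent = 0.0
    last_heard = time.monotonic()

    while True:
        now = time.monotonic()
        if now - last_sent >= HEARTBEAT_INTERVAL:
            sock.sendto(b"HB", peer)        # heartbeat: "still here"
            last_sent = now
        try:
            sock.recvfrom(2048)
            last_heard = time.monotonic()   # any packet at all counts as liveness
        except socket.timeout:
            pass
        if time.monotonic() - last_heard > PEER_TIMEOUT:
            raise ConnectionError("peer silent too long; treating as disconnected")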

These types of keepalives are usually best handled at the application protocol layer where you can design in more knobs and respond in different ways. Otherwise you may see unexpected interactions between different keepalive mechanisms in different parts of the protocol stack.

Like TCP keepalives?

  • If the feature already technically exists in TCP, it's either broken or disabled by default, which is pretty much the same as not having it.

    • Keepalives are an optional TCP feature, so they are not necessarily supported by all TCP implementations and therefore default to off even where they are supported.

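      Where they are supported, opting in looks something like this in Python on Linux; the TCP_KEEP* socket options are Linux-specific, and the values chosen here are just illustrative:

          import socket

          sock = socket.create_connection(("example.com", 80))

          # Keepalives are off by default; each connection must opt in.
          sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

          # Linux-specific tuning (other OSes expose different knobs, or none):
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before the first probe
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
          sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before the kernel drops the connection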

    • You're conflating all optional TCP features of all operating systems, network devices, and RFCs together. This lack of nuance fails to appreciate that different applications have different needs for how they use TCP: ( server | client ) x ( one way | chatty bidirectional | idle tinygram | mixed ). If a feature needs to be used on a particular connection, then use it. ;)

If a socket is closed properly there'll be a FIN and the other side can learn about it by polling the socket.

If the network connection is lost due to external circumstances (say your modem crashes) then how would that information propagate from the point of failure to the remote end on an idle connection? Either you actively probe (keepalives) and risk false positives or you wait until you hear again from the other side, risking false negatives.
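
A minimal Python illustration of what is and isn't observable; the host and port are placeholders. An orderly close (FIN) shows up as a zero-byte read, while a vanished peer shows up as nothing at all:

    import select
    import socket

    sock = socket.create_connection(("example.com", 80))

    # Wait up to 5 seconds for the socket to become readable.
    readable, _, _ = select.select([sock], [], [], 5.0)
    if readable:
        data = sock.recv(4096)
        if data == b"":
            print("peer sent FIN: orderly close detected")
        else:
            print(f"received {len(data)} bytes")
    else:
        # No data, no FIN, no error: could be an idle peer, could be a dead modem.
        print("silence: cannot distinguish idle from disconnected")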

  • It gets even worse - routing changes causing traffic to blackhole would still be undetectable without a timeout mechanism, since probes and responses would be lost.

  • > If the network connection is lost due to external circumstances (say your modem crashes) then how would that information propagate from the point of failure to the remote end on an idle connection?

    Observe the line voltage? If it gets cut then you have a problem...

    > Either you actively probe (keepalives) and risk false positives

    What false positives? Are you thinking there's an adversary on the other side?

    • This is an L2 vs L3 thing.

      Most network links absolutely will detect that the link has gone away; the little LED will turn off and the OS will be informed on both ends of that link.

      But one end of the link is a router, and routers are (NAT aside) stateless. The router does not know which TCP connections are currently running through it, so it cannot notify them. Only when a packet for the dead link arrives can it send back an ICMP error.

      A TCP connection with no traffic in flight simply does not exist as far as the intermediate routers are concerned.

      (Direct contrast to the old telecom ATM protocol, which was circuit switched and required "reservation" of a full set of end-to-end links).
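
      On Linux, for instance, the host's view of its own links is directly readable (a sketch; the sysfs path and interface name are Linux-specific), but nothing comparable exists for links several hops away:

          from pathlib import Path

          def link_is_up(iface: str) -> bool:
              """Local L2 link state, as reported by the kernel via sysfs."""
              state = Path(f"/sys/class/net/{iface}/operstate").read_text().strip()
              return state == "up"

          # The OS knows immediately when *this* cable is unplugged...
          print(link_is_up("eth0"))
          # ...but it has no visibility into the links a connection traverses upstream.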

    • For a given connection, packets might traverse, say, 10 links. If one link goes down (or is saturated and dropping packets), the connection is supposed to route around it.

      So, except for a link at either end going down (really just one end, since if the other end is in a data center the TCP connection likely terminates at a server with redundant networking), you wouldn't want a connection torn down just because a link somewhere died.

      That's explicitly against the goals of a packet-switched network.