Comment by stavros

18 hours ago

Wait, can you actually just use IP? Can I just make up a packet and send it to a host across the Internet? I'd think that all the intermediate routers would want to have an opinion about my packet, caring, at the very least, that it's either TCP or UDP.

You can definitely craft an IP packet by hand and send it. If it's IPv4, you need to put a number between 0 and 255 in the protocol field from this list: https://www.iana.org/assignments/protocol-numbers/protocol-n...

Core routers don't inspect that field, but NAT/ISP boxes can. I believe that with two suitable dedicated Linux servers it is entirely possible to send and receive a single custom IP packet between them, even using 253 or 254 ("Use for experimentation and testing", RFC 3692) as the protocol number.
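
For the curious, here is a minimal sketch of what "crafting it by hand" could look like in Python. The function names and the example addresses are made up for illustration; the header layout follows the standard 20-byte IPv4 header with no options. Actually sending needs root (or CAP_NET_RAW) on Linux, and whether the packet survives the path depends on the middleboxes discussed in this thread.

```python
import socket
import struct

EXPERIMENTAL_PROTO = 253  # IANA: "Use for experimentation and testing"


def internet_checksum(data: bytes) -> int:
    """Ones'-complement sum over 16-bit big-endian words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_packet(src: str, dst: str, payload: bytes,
                      proto: int = EXPERIMENTAL_PROTO, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header (no options) followed by the payload."""
    version_ihl = (4 << 4) | 5          # version 4, header length 5 words
    total_length = 20 + len(payload)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_length,   # version/IHL, TOS, total length
        0, 0,                           # identification, flags/fragment offset
        ttl, proto, 0,                  # TTL, protocol, checksum placeholder
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:] + payload


def send_custom(dst: str, payload: bytes, proto: int = EXPERIMENTAL_PROTO) -> None:
    """Hypothetical helper: let the kernel build the IP header for us.

    Requires root or CAP_NET_RAW; not called here.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_RAW, proto) as s:
        s.sendto(payload, (dst, 0))


packet = build_ipv4_packet("192.0.2.1", "198.51.100.2", b"hello")
```

Byte 9 of the header is the protocol field, so that single byte is all that distinguishes this from a TCP or UDP packet as far as the IP layer is concerned.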

  • > If it's IPv4, you need to put a number between 0 and 255 in the protocol field from this list:

    To save a skim (though it's an interesting list!), protocol codes 253 and 254 are suitable "for experimentation and testing".

  • What happens when the remaining 104 unassigned protocol numbers are exhausted?

    • We're about half-way to exhausted, but a huge chunk of the ones assigned are long deprecated and/or proprietary technologies and could conceivably be reassigned. Assignment now is obviously a lot more conservative than it was in the 1980s.

      There is sometimes drama with it, though. A while back, the OpenBSD guys created CARP as a fully open source router failover protocol, but couldn't get an official IP protocol number and ended up using the same one as VRRP. There's also a lot of historical animosity over the fact that some companies got numbers for proprietary protocols (e.g. Cisco got one for its then-proprietary EIGRP).

      https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers

    • Probably by using some type of option. Options can take up to 320 bits, so I think there is a reasonable amount of space there for a good while. Of course, this makes processing really messy, but with current hardware it's not impossible.

    • People will start overloading the numbers.

      I do hope we'll have stopped using IPv4 by then... But well, a decade after address exhaustion we are still on it, so who knows?

  • Playing with protocol number change usually results in “Protocol Unreachable” or “Malformed Packet” from your OS.

  • This is an interesting list; it makes you appreciate just how many obscure protocols have died out in practice. Evolution in networks seems to mimic evolution in nature quite well.
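
On the "up to 320 bits" of IPv4 options mentioned in the replies above: that figure falls out of the 4-bit header-length (IHL) field, which counts 32-bit words. A quick arithmetic sketch (the constant and function names are mine):

```python
IHL_BITS = 4            # width of the IPv4 header-length field
WORD_BYTES = 4          # IHL counts 32-bit words
BASE_HEADER_BYTES = 20  # fixed part of the IPv4 header


def max_ipv4_option_bits() -> int:
    """Largest options area an IPv4 header can carry, in bits."""
    max_header_bytes = (2 ** IHL_BITS - 1) * WORD_BYTES      # 15 words = 60 bytes
    max_option_bytes = max_header_bytes - BASE_HEADER_BYTES  # 40 bytes
    return max_option_bytes * 8


print(max_ipv4_option_bits())  # 320
```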

> caring, at the very least, that it's either TCP or UDP.

You left out ICMP, my favourite! (And a lot more important in IPv6 than in v4.)

Another pretty well known protocol that is neither TCP nor UDP is IPsec. (Which is really two new IP protocols.) People were still designing proper IP protocols in the 90s.

> Can I just make up a packet and send it to a host across the Internet?

You should be able to. But if you are on a corporate network with a really strict firewalling router that only forwards traffic it likes, then likely not. There are also really crappy home routers which give similar problems from the other end of enterpriseness.

NAT also destroyed much of the end-to-end principle. If you don't have a real IP address and rely on a NAT router to forward your data, it needs to be in a protocol the router recognizes.

Anyway, for the past two decades people have grown tired of that and just pile hacks on top of TCP or UDP instead. That's sad. Or who am I kidding? Really it's on top of HTTP. HTTP will likely live on long past anything IP.

  • There is little point in inventing new protocols, given how low the overhead of UDP is. That's just 8 bytes per packet, and it enables going through NAT. Why come up with a new transport layer protocol, when you can just use UDP framing?

    • Agreed. Building a custom protocol seems “hard” to many folks who are doing it without any fear on top of HTTP. The wild shenanigans I’ve seen with headers, query params and JSON make me laugh a little. Everything as text is _actually_ hard.

      A part of the problem with UDP is the lack of good platforms and tooling. Examples as well. I’m trying to help with that, but it’s an uphill battle for sure.

  • > NAT also destroyed much of the end-to-end principle. If you don't have a real IP address and rely on a NAT router to forward your data, it needs to be in a protocol the router recognizes.

    Not necessarily. Many protocols can survive being NATed if they don't carry IP/port related information inside their payload. FTP is a famous counterexample: it uses a control channel (TCP port 21) which contains commands to open data channels (TCP port 20), and those commands specify IP:port pairs, so, depending on the protocol, a NAT router has to rewrite them and/or open ports dynamically and/or create NAT entries on the fly. A lot of other stuff has no need for that and will happily go through without any rewriting.

    • I think we agree. Of course a NAT router with an application proxy such as FTP or SIP can relay and rewrite traffic as needed.

      TCP and UDP have port numbers that the NAT software can extract and keep state tables for, so we can send the return traffic to its intended destination.

      For unknown IP protocols that is not possible. It may at best act like a network diode, which is one way of violating the end-to-end principle.

    • Of course NAT allows application layer protocols layered on TCP or UDP to pass through without the NAT understanding the application layer – otherwise, NATted networks would be entirely broken.

      The end-to-end principle at the IP layer (i.e. having the IP forwarding layer be agnostic to the transport layer protocols above it) is still violated.

  • > You left out ICMP, my favourite!

    Even ICMP has a hard time traversing NATs and firewalls these days, for largely bad reasons. Try pinging anything in AWS, for example...

    • Have to say that I don't encounter any problems pinging hosts in AWS.

      If any host is firewalling out ICMP then it won't be pingable but that does not depend on the hosting provider. AWS is no better or worse than any other in that regard, IME.
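
To make the NAT point from the replies above concrete, here is a toy port-translating NAT table. It is only a sketch: the class and method names are invented, and real devices vary (some do forward unknown protocols using an address-only mapping). The point is that the mapping is keyed on a port number, and an unknown IP protocol offers no such field.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

TCP, UDP = 6, 17  # IP protocol numbers


@dataclass
class ToyNapt:
    public_ip: str
    next_port: int = 40000
    # (proto, private_ip, private_port) -> public_port
    out_map: Dict[Tuple[int, str, int], int] = field(default_factory=dict)
    # (proto, public_port) -> (private_ip, private_port)
    in_map: Dict[Tuple[int, int], Tuple[str, int]] = field(default_factory=dict)

    def outbound(self, proto: int, src_ip: str,
                 src_port: Optional[int]) -> Optional[Tuple[str, int]]:
        """Translate an outgoing flow, or return None if there is nothing to key on."""
        if proto not in (TCP, UDP) or src_port is None:
            return None  # assumption: this toy simply declines unknown protocols
        key = (proto, src_ip, src_port)
        if key not in self.out_map:
            self.out_map[key] = self.next_port
            self.in_map[(proto, self.next_port)] = (src_ip, src_port)
            self.next_port += 1
        return (self.public_ip, self.out_map[key])

    def inbound(self, proto: int, dst_port: int) -> Optional[Tuple[str, int]]:
        """Find the private host for return traffic, if a mapping exists."""
        return self.in_map.get((proto, dst_port))


nat = ToyNapt("203.0.113.1")
print(nat.outbound(UDP, "10.0.0.2", 5000))  # ('203.0.113.1', 40000)
print(nat.outbound(253, "10.0.0.2", None))  # None
```

Two private hosts using the same source port get distinct public ports, which is how return traffic finds its way back; a protocol 253 packet has no equivalent handle.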

If there's no form of NAT or transport layer processing along your path between endpoints, you shouldn't have an issue. But NAT and transport and application layer load balancing are very common on the net these days, so YMMV.

You might have more luck with an IPv6 packet.

> I'd think that all the intermediate routers would want to have an opinion about my packet, caring, at the very least, that it's either TCP or UDP.

They absolutely don't. Routers are layer 3 devices; TCP & UDP are layer 4. The only impact is that the ECMP flow hashes will have less entropy, but that's purely an optimization thing.

Note TCP, UDP and ICMP are nowhere near all the protocols you'll commonly see on the internet — at minimum, SCTP, GRE, L2TP and ESP are reasonably widespread (even a tiny fraction of traffic is still a giant number considering internet scales).

You can send whatever protocol number with whatever contents your heart desires. Whether the other end will do anything useful with it is another question.
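
As a rough illustration of "whether the other end will do anything useful with it", here is a toy model of the receiving host's demultiplexing step. The handler table and function names are invented for this sketch; a real stack is far more involved, and a raw socket bound to the protocol number would receive the packet instead. Hosts commonly answer an unsupported protocol with an ICMP "protocol unreachable" message, which is what the fallback string stands in for.

```python
import struct

# Hypothetical handler table keyed by IP protocol number.
DEFAULT_HANDLERS = {1: "icmp", 6: "tcp", 17: "udp"}


def parse_ipv4(packet: bytes):
    """Return (protocol number, payload) from a raw IPv4 packet."""
    (version_ihl,) = struct.unpack_from("!B", packet, 0)
    if version_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    header_len = (version_ihl & 0x0F) * 4  # IHL counts 32-bit words
    return packet[9], packet[header_len:]


def deliver(packet: bytes, handlers=DEFAULT_HANDLERS) -> str:
    """Hand the payload to whatever is registered for its protocol number."""
    proto, payload = parse_ipv4(packet)
    handler = handlers.get(proto)
    if handler is None:
        return "icmp protocol unreachable"
    return f"{handler}: {len(payload)} bytes"


# A hand-assembled header: version/IHL 0x45, total length 25, TTL 64, protocol 253.
sample = bytes([0x45, 0, 0, 25, 0, 0, 0, 0, 64, 253, 0, 0]) + bytes(8) + b"hello"
print(deliver(sample))                       # icmp protocol unreachable
print(deliver(sample, {253: "experiment"}))  # experiment: 5 bytes
```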

  • > They absolutely don't. Routers are layer 3 devices;

    Idealized routers are, yes.

    Actual IP paths these days usually involve at least one NAT, and these will absolutely throw away anything other than TCP, UDP, and if you're lucky ICMP.

    • See nearby comment about terminology. Either we're discussing odd IP protocols, then the devices you're describing aren't just "routers" (and particularly what you're describing is not part of a "router"), or we're not discussing IP protocols, then we're not having this thread.

      And note the GP talked about "intermediate routers". Those are the ones in a telco service site or datacenter, in my book.

As far as I'm aware, sure you can. TCP packets and UDP datagrams are wrapped in IP datagrams, and it's the job of an IP network to ship your data from point A (sender) to point B (receiver). Nodes along the way might do so-called "deep packet inspection" to snoop on the payload of your IP datagrams (for various reasons, not all nefarious), but they don't need to do that to do the basic job of routing. From a semantic standpoint, the information in the TCP and UDP headers (as part of the IP payload) is only there to govern interactions between the two endpoint parties. (For instance, the "port" of a TCP or UDP packet is a node-local identifier for one of many services that might exist at the IP address the packet was routed to, allowing many services to coexist at the same node.)

  • Hmm, I thought intermediate routers use the TCP packet's bits for congestion control, no? Though I guess they can probably just use the destination IP for that.

    • Most intermediate routers don't care much. Look up the destination IP in the routing table, forward to the next hop, no time for anything else.

      Classic congestion control is done on the sender alone. The router's job is simply to drop packets when the queue is too large.

      Maybe the router supports ECN, so if a queue is building toward the next hop, it will mark the ECN bits, which sit in the IP header itself rather than in any protocol-specific header.

      Some network elements do more than the usual routing work. A traffic shaper might have per-user queues with outbound bandwidth limits. A network accelerator may effectively reterminate TCP in hopes of increasing achievable bandwidth.

      Often, the router has an aggregated connection to the next hop, so it'll use a hash on the addresses in the packet to choose which of the underlying connections to use. That hash could be based on many things, but it's not uncommon to use TCP or UDP port numbers if available. This can also be used to choose between equally scored next hops, and that's why you often see several different paths during a traceroute. Using port numbers is helpful to balance connections from IP A to IP B over multiple links. If you use an unknown protocol, even if it is multiplexed into ports or similar (like TCP and UDP), the different streams will likely always hash onto the same link, so you won't be able to exceed the bandwidth of a single link, and a damaged or congested link will affect all or none of your connections.

    • They probably can do deep/shallow packet inspection for that purpose (being one of the non-nefarious applications I alluded to), but that's not to say their correct functioning relies on it. Those routers also need to support at least UDP, and UDP provides almost no extra information at that level -- just the source and destination ports (so, perhaps QoS prioritization) and the inner payload's length and checksum (so, perhaps dropping bad packets quickly).

      If middleware decides to do packet inspection, it had better make sure that any behavioral differences (relative to not doing any inspection) are strictly optimizations and do not impact the correctness of the link.

      Also, although I'm not a network operator by any stretch, my understanding is that TCP congestion control is primarily a function of the endpoints of the TCP link, not the IP routers along the way. As Wikipedia explains [0]:

      > Per the end-to-end principle, congestion control is largely a function of internet hosts, not the network itself.

      [0]: https://en.wikipedia.org/wiki/TCP_congestion_control
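
The link-selection hashing described in the replies above can be sketched roughly like this. It is an illustration only: real routers use fast hardware hashes rather than SHA-256, and the function name and the choice of four links are assumptions. What it shows is that TCP/UDP flows contribute port numbers to the hash key, while an unknown protocol between the same two hosts always produces the same key.

```python
import hashlib
from typing import Optional

TCP, UDP = 6, 17  # IP protocol numbers


def pick_link(src_ip: str, dst_ip: str, proto: int,
              src_port: Optional[int] = None, dst_port: Optional[int] = None,
              n_links: int = 4) -> int:
    """Choose one of n_links parallel links from a flow hash."""
    if proto in (TCP, UDP) and src_port is not None and dst_port is not None:
        # 5-tuple: ports add entropy, so flows can spread across links
        key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}"
    else:
        # 3-tuple: every stream between these two hosts hashes identically
        key = f"{src_ip}|{dst_ip}|{proto}"
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_links


udp_links = {pick_link("10.0.0.1", "10.0.0.2", UDP, p, 53)
             for p in range(40000, 40064)}
custom_links = {pick_link("10.0.0.1", "10.0.0.2", 253) for _ in range(64)}
print(len(udp_links), len(custom_links))
```

The custom-protocol set always has exactly one member, which is the "all or none of your connections" effect: every stream shares the fate of a single link.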

The reason you wouldn't do that is that IP doesn't give you a mechanism to share an IP address among multiple processes on a host; it just gets your packets to a particular host.

As soon as you start thinking about having multiple services on a host, you end up with the idea of having a service ID or "port".

UDP or UDP-Lite gives you exactly that at the cost of 8 bytes, so there's no real value in not just putting everything on top of UDP.
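
Those 8 bytes are four 16-bit fields: source port, destination port, length, and checksum. A small sketch of the framing and of port-based dispatch on the receiving side (the function names and the service table are hypothetical; the checksum is left at zero, which UDP over IPv4 permits as "not computed"):

```python
import struct
from typing import Callable, Dict, Optional

UDP_HEADER = struct.Struct("!HHHH")  # src port, dst port, length, checksum


def udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with an 8-byte UDP header (checksum left at 0)."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def demux(datagram: bytes,
          services: Dict[int, Callable[[bytes], bytes]]) -> Optional[bytes]:
    """Hand the payload to the service registered on the destination port."""
    _src, dst_port, length, _checksum = UDP_HEADER.unpack_from(datagram)
    handler = services.get(dst_port)
    if handler is None:
        return None  # nothing is listening on that port
    return handler(datagram[UDP_HEADER.size:length])


d = udp_datagram(40000, 9000, b"ping")
print(len(d) - len(b"ping"))                   # 8
print(demux(d, {9000: lambda p: p.upper()}))   # b'PING'
```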

Yep, it's full of IP protocols other than the well-known TCP, UDP and ICMP (and, if you ever had the displeasure of learning IPsec, its AH and ESP).

A bunch of multicast stuff (IGMP, PIM)

A few routing protocols (OSPF, but notably not BGP which just uses TCP, and (usually) not MPLS which just goes over the wire - it sits at the same layer as IP and not above it)

A few VPN/encapsulation solutions like GRE, IP-in-IP, L2TP and probably others I can't remember

As usual, Wikipedia has got you covered, much better than my own recollection: https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers

  • To GP's point, though, most of these will unfortunately be dropped by most middleboxes for various reasons.

    Behind a NA(P)T, you can obviously only use those protocols that the translator knows how to remap ports for.

You know, I've always wondered if you could run Kermit*-over-IP, without having TCP in between.

*The protocol.

They shouldn't; the whole point is that the IP header is enough to route packets between endpoints, and only the endpoints should care about any higher layer protocols. But unfortunately some routers do, and if you have NAT then the NAT device needs to examine the TCP or UDP header to know how to forward those packets.

Probably not, loads of routers are even blocking parts of ICMP.

  • That's firewalls (or others), not routers. If it blocks things, it's by definition not a router anymore.

    • You can call the things mangling IP addresses and TCP/UDP ports what you want, but that will unfortunately not make them go away and stop throwing away non-TCP/UDP traffic.

    • Both things come on the same box nowadays.

      There are many routers that don't care at all about what's going through them. But there aren't any firewalls that don't route anymore (not even at the endpoints).