
Comment by bityard

1 day ago

I wonder if someone might indulge me by answering a question or two about Tailscale. I have a self-managed WireGuard network which works, but probably isn't very smart or elegant.

From what I can gather, Tailscale does a lot of "magic" things to accomplish its goals, and some of them actually have "magic" right in the name. As a system administrator by trade, I have been bitten SO MANY TIMES by things that try to automagically mess with DNS resolution, routing tables, firewall rules, etc in the name of user-friendliness. (Often, things that even ship with the OS itself.)

Is there any documentation, or are there any articles, detailing exactly what it's doing under the hood? I found https://tailscale.com/docs/concepts but it doesn't really cover everything.

If I have a virtualization host with, let's call it a "very custom" networking configuration, how likely is it to interfere with things? Is it polite and smart about working around fancy networking setups, or does it really only handle the common cases (one networking interface, a default route, public nameserver) elegantly?

It's difficult for us to maintain documentation of exactly the kind you'd want there, though we do try to keep up with docs as best we can. In particular there is a fairly wide array of heuristics in the client to adapt to the environment that it's running in - and this is most true on Linux where there are far far too many different configuration patterns and duplicate subsystems (example: https://tailscale.com/blog/sisyphean-dns-client-linux).

To try and take a general poke at the question in more of the context you leave at the end:

- We use rule-based routing to try to dodge arbitrary order conflicts in the routing tables.

- We install our rules with high priority because traffic intended for the tailnet hitting non-tailscale interfaces is typically undesirable (it's often plain text).

- We integrate with systemd-resolved _by preference_ on Linux if it is present, so that if you're using cgroup/namespace features (containers, sandbox runtimes, etc.) then this provides the expected DNS/interface pairings. If we can't find systemd-resolved we fall back to modifying /etc/resolv.conf, which is unavoidably an area of conflict on such systems (macOS and Windows have more broadly standard solutions we can use instead, modulo other platform details).

- We support integration with both iptables and nftables (the latter is behind manual configuration currently due to slightly less broad standardization, but is defaulted by heuristic on some distros/in some environments (like gokrazy, some containers)). In nftables we create our own tables, and just install jumps into the xtables conventional locations so as to be compatible with ufw, firewalld and so on.

- We do our best in tailscaled's sshd to implement login in a broadly compatible way, but again this is another of those places where the Linux ecosystem lacks standards and there's a ton of distro variation right now (freedesktop's concerns start at a higher level so they haven't driven standardization, everyone else like OpenSSH has their own pile of best-guesses, and distros go ham with patches).

- We need a 1360 byte MTU path to peers for full support/stability. Our inner/interface MTU is 1280, the minimum MTU for IPv6; once that is packed in WireGuard and an outer IPv6 header, it comes to 1360.
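
For anyone checking the 1360 figure against their own path, the arithmetic can be sketched roughly as below. The header sizes are assumptions taken from the usual wire formats (40 bytes for IPv6, 20 bytes for IPv4 without options, 8 bytes for UDP, and 32 bytes for the WireGuard data header plus authentication tag), and `outer_path_mtu` is just an illustrative helper name, not something from the client.

```python
# Rough sketch of the MTU arithmetic; header sizes are assumptions
# based on the standard wire formats, not values read from the client.

INNER_MTU = 1280         # inner/interface MTU, the IPv6 minimum
WIREGUARD_OVERHEAD = 32  # assumed: 16 byte data header + 16 byte auth tag
UDP_HEADER = 8           # WireGuard packets travel over UDP
IPV6_HEADER = 40         # fixed IPv6 header, no extension headers
IPV4_HEADER = 20         # IPv4 header without options


def outer_path_mtu(inner_mtu: int, ip_header: int) -> int:
    """Smallest path MTU that carries a full inner packet unfragmented."""
    return inner_mtu + WIREGUARD_OVERHEAD + UDP_HEADER + ip_header


if __name__ == "__main__":
    print("outer IPv6:", outer_path_mtu(INNER_MTU, IPV6_HEADER))
    print("outer IPv4:", outer_path_mtu(INNER_MTU, IPV4_HEADER))
```

Under those assumptions the outer IPv6 case is the larger of the two, which is presumably why it is the one quoted as the requirement.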

I can't answer directly based on "very custom" if there will be any challenges to deal with. We do offer support to work through these things though, and have helped some users with fairly exotic setups.

  • > It's difficult for us to maintain documentation of exactly the kind you'd want there

    Suggestion: let an LLM maintain it for you.

    Alternate suggestion for OP: let an LLM generate the explanations you want from the code (when available).

    • This problem space is not small enough to stay within the current LLM attention span. A sufficiently good agent setup might be able to help maintain docs somewhat through changes, but organizing them in an approachable way, covering all the heuristics spread across so many places and external systems, with a huge number of time- and version-dependent variables, is hugely troublesome for current LLM capabilities. They're better at simpler problems, like typing the code.

I believe the client is open source, and there's a reverse-engineered server (that some Tailscale employees contribute to).