Comment by iscoelho
2 days ago
VXLAN over WireGuard is acceptable if you require a shared L2 boundary.
IPSec over VXLAN is what I recommend if you are doing 10G or above. There is a much higher performance ceiling than WireGuard with IPSec via hardware firewalls. WireGuard is comparatively quite slow performance-wise. Noting Tailscale, since it has been mentioned, has comparatively extremely slow performance.
edit: I'm noticing that a lot of the other replies in this thread are not from network engineers. Among network engineers WireGuard is not very popular due to performance & absence of vendor support. Among software engineers, it is very popular due to ease of use.
> Noting Tailscale, since it has been mentioned, has comparatively extremely slow performance.
Isn't this mainly because Tailscale relies on userspace WG (wireguard-go)? I'd imagine the perf ceiling is much higher for kernel WG, which I believe is what Netbird uses.
wireguard-go is indeed very slow. For example, the official WireGuard Mac client uses it, and performance on my M1 Max is CPU capped at 200Mbps. The kernel WireGuard implementation available for Linux is certainly faster, but I would not consider it fast.
Tailscale however, although it derives from WireGuard libraries and the protocol, is really not WireGuard at all- so comparing it is a bit apples to oranges. With that said, it is still entirely userspace and its performance is less than stellar.
Well, according to this[1] bench, you can get ~10 Gbps with kernel WG.
I'm interested in this because I'm working on a small hobby project to learn eBPF. The idea is to implement a "Tailscale-lite" that eliminates context switches by keeping both Wireguard and L3 and L4 policy handling in kernel space. To me, the bulk of Tailscale's overhead comes from the fact that the dataplane is running between user and kernel space.
[1]: https://github.com/cyyself/wg-bench
2 replies →
How is IPSec performance better than wg? I never heard this before, it sounds intriguing.
At this time, there is no commercial offering for hardware/ASIC WireGuard implementations. The standard WireGuard implementation cannot reach 10G.
The fastest I am aware of is VPP (open-source) & Intel QAT [1], which while it is achieves impressive numbers for large packets (70Gbps @ 512 / 200Gbps @ 1420 on a $20k+ MSRP server), is still not comparable with commercial IPsec offerings [2][3][4] that can achieve 800Gbps+ on a single gateway (and come with the added benefit of relying on a commercial product with support).
[1] https://builders.intel.com/docs/networkbuilders/intel-qat-ac...
[2] https://www.juniper.net/content/dam/www/assets/datasheets/us...
[3] https://www.paloaltonetworks.com/apps/pan/public/downloadRes...
[4] https://www.fortinet.com/content/dam/fortinet/assets/data-sh...
There are also solutions like Arista TunnelSec [1] that can achieve IPsec and VXLANsec at line-rate performance (21.6Tbps per chassis)! This is fairly new and fancy though.
[1] https://www.arista.com/assets/data/pdf/Whitepapers/EVPN-Data...
This lack of ASIC is interesting to me. If it existed, that would very much change the game. And, given the simplicity of WG encryption it would be a comparatively small design (lower cost?)
If you have an edge device which implements hardware IPsec at 10g+ but pushes WireGuard to software on an underpowered cpu then sure.
While that's true, I'm not sure it's because of something inherent in IPsec vs WireGuard. It's more likely due to the fact that hardware accelerators have been designed to offload encryption routines that IPsec uses.
One wonders what WG perf would look like if it could leverage the same hardware offload.
4 replies →