Comment by jeffbee
7 hours ago
There definitely are bottlenecks. The one I always think of is the kernel's networking stack. There's no sense in using the kernel TCP stack when you have hundreds of independent workloads. That doesn't make any more sense than it would have made 20 years ago to have an external TCP appliance at the top of your rack. Userspace protocol stacks win.
Do the partitioned stacks of network namespaces share a single underlying global stack, or are they fully independent instances? (And if they are not independent, could they be made so?)
Usually network namespaces are linked together with a single bridge, so you can get lock contention there.
If you have a separate physical NIC for each namespace you probably won't have any contention.
I think you could get much of the way there by isolating a single NIC's receive queues, so the kernel doesn't decide to run off and service softirqs for random foreign tasks just because your task called tcp_sendmsg.
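To make that a bit more concrete, here is a rough sketch of one way to approximate it on Linux: steer the IRQ for a given receive queue to a dedicated set of CPUs, then pin the task to the same CPUs. The helper names are mine, and the IRQ number is something you would look up in /proc/interrupts for your own NIC, so it is purely hypothetical here. The mask format (comma-separated 32-bit hex groups) is what /proc/irq/<N>/smp_affinity uses. This does not guarantee the kernel never services foreign softirqs on your CPU, and a daemon like irqbalance may overwrite the setting.

```python
import os


def cpus_to_irq_mask(cpus):
    """Format a set of CPU indices as a hex bitmask in the style used by
    /proc/irq/<N>/smp_affinity (32-bit groups, most significant first)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    groups = []
    while True:
        groups.append(f"{mask & 0xFFFFFFFF:08x}")
        mask >>= 32
        if mask == 0:
            break
    return ",".join(reversed(groups))


def steer_irq(irq, cpus):
    """Write the affinity mask for one IRQ. Needs root; `irq` is whatever
    number your NIC's RX queue shows in /proc/interrupts (hypothetical here)."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpus_to_irq_mask(cpus))


def pin_current_task(cpus):
    """Restrict the calling process to the same CPUs (Linux only)."""
    os.sched_setaffinity(0, set(cpus))


if __name__ == "__main__":
    # Only print the mask; actually applying it is left to the reader.
    print(cpus_to_irq_mask({2, 3}))
```

With something like this you would call steer_irq(irq, {2, 3}) once per RX queue you want to reserve, then pin_current_task({2, 3}) in the workload that owns that queue.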
io_uring?
If anything, io_uring makes the problem much worse by reducing the cost of one process flooding kernel internals in a single syscall.