Comment by mgaunard
1 day ago
"zero syscall"
> In order to avoid busy looping, both the kernel and the web server will only busy-loop checking the queue for a little bit (configurable, but think milliseconds), and if there’s nothing new, the web server will do a syscall to “go to sleep” until something gets added to the queue.
Under load it's zero syscalls (barring any rare allocations inside rustls for the handshake; I can't guarantee those never trigger one).
Without load, the overhead of (effectively) calling sleep() is technically there, but it's not relevant.
But sure, you can tweak the busy-loop timers and burn 100% CPU on both the kernel and user side indefinitely if you want to avoid that sleep-when-idle syscall. It's just… not a good idea.
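If you want to see the knob I'm describing, here's a minimal liburing sketch (C, not the server's actual code; the 2000 ms idle value is just an example):

    /* Minimal sketch: IORING_SETUP_SQPOLL starts a kernel-side thread that
     * busy-polls the submission queue; sq_thread_idle is the "little bit"
     * of busy-looping (in milliseconds) before that thread goes to sleep.
     * May require elevated privileges on older kernels. */
    #include <liburing.h>
    #include <stdio.h>

    int main(void)
    {
        struct io_uring ring;
        struct io_uring_params p = {0};

        p.flags = IORING_SETUP_SQPOLL;
        p.sq_thread_idle = 2000;   /* busy-poll for up to 2000 ms when idle */

        if (io_uring_queue_init_params(256, &ring, &p) < 0) {
            perror("io_uring_queue_init_params");
            return 1;
        }

        /* Submissions are just writes to the shared ring. liburing only
         * makes an io_uring_enter() syscall if the SQPOLL thread has
         * already gone idle (IORING_SQ_NEED_WAKEUP is set) -- that is the
         * sleep-when-idle / wake-up syscall discussed above. */
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_nop(sqe);
        io_uring_submit(&ring);

        io_uring_queue_exit(&ring);
        return 0;
    }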
In my experience, trying to use io_uring for spinning/no-syscall operation is not straightforward.
First, some tricks are required to actually make it work at all; then there's the problem that you need a dedicated core not only in userland but also inside the kernel, both of them per application.
Sharing a spinning kernel thread across multiple applications is also possible, but it requires further effort (you need to share a parent ring across processes, which need to be related).
Overall I feel that it doesn't really deliver on the no-syscall idea, certainly not out of the box. You might have a more straightforward experience with XDP, which incidentally gives you a lot more access and control as well if you need it.
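If I remember the mechanism correctly, the sharing goes through IORING_SETUP_ATTACH_WQ; a rough in-process sketch with liburing (names and values are just illustrative):

    /* Rough sketch: attach a second SQPOLL ring to the first ring's kernel
     * backend so both can share one kernel polling thread instead of each
     * spinning their own. The attach is by ring fd, which is why the
     * processes involved have to be related (the fd has to be inherited
     * or passed between them). */
    #include <liburing.h>

    int make_shared_rings(struct io_uring *parent, struct io_uring *child)
    {
        struct io_uring_params pp = {0}, cp = {0};
        int ret;

        pp.flags = IORING_SETUP_SQPOLL;
        pp.sq_thread_idle = 1000;
        ret = io_uring_queue_init_params(256, parent, &pp);
        if (ret < 0)
            return ret;

        cp.flags = IORING_SETUP_SQPOLL | IORING_SETUP_ATTACH_WQ;
        cp.wq_fd = parent->ring_fd;   /* attach to the parent ring's backend */
        cp.sq_thread_idle = 1000;
        ret = io_uring_queue_init_params(256, child, &cp);
        if (ret < 0)
            io_uring_queue_exit(parent);
        return ret;
    }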
It’s good to read an article until the end
> This means that a busy web server can serve all of its queries without even once (after setup is done) needing to do a syscall. As long as queues keep getting added to, strace will show nothing.
Like all polling I/O models (that don't spin), it also means you have to wait milliseconds in the worst case before you start servicing a request. That's a long time.
For comparison, a read/write over a TCP socket on loopback between two processes takes a few microseconds using the BSD sockets API.
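For anyone who wants to check that number, a rough way to measure it (C, error handling omitted; none of this is from the article):

    /* Time a 1-byte ping-pong over TCP on loopback between a parent and a
     * forked child, using plain blocking BSD sockets. */
    #include <arpa/inet.h>
    #include <netinet/tcp.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <sys/wait.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        int ls = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = {0};
        socklen_t len = sizeof(addr);
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = 0;                      /* let the kernel pick a port */
        bind(ls, (struct sockaddr *)&addr, sizeof(addr));
        listen(ls, 1);
        getsockname(ls, (struct sockaddr *)&addr, &len);

        const int rounds = 100000;
        char byte = 'x';
        int one = 1;

        if (fork() == 0) {                      /* child: echo one byte back */
            int c = accept(ls, NULL, NULL);
            setsockopt(c, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
            for (int i = 0; i < rounds; i++) {
                read(c, &byte, 1);
                write(c, &byte, 1);
            }
            _exit(0);
        }

        int s = socket(AF_INET, SOCK_STREAM, 0);
        setsockopt(s, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
        connect(s, (struct sockaddr *)&addr, sizeof(addr));

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < rounds; i++) {      /* blocking write + read per round trip */
            write(s, &byte, 1);
            read(s, &byte, 1);
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        wait(NULL);

        double us = ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec))
                    / 1000.0 / rounds;
        printf("%.2f us per round trip\n", us);
        return 0;
    }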
> Like all polling I/O models (that don't spin), it also means you have to wait milliseconds in the worst case before you start servicing a request. That's a long time.
No? What they're saying is that the busy loop will spin until an event occurs, for at most X ms. And if it does park the thread (the only syscall required), it can be woken up immediately on the first event too. Only if multiple events occurred since the last call would you receive them together. That normally happens only under high load, when event processing takes long enough that new events build up in the background. Increased latency is the intended outcome under high load.
To be fair, it's been a while since I read the io_uring paper. But I distinctly recall the mix of poll and park behavior, plus the configurable wait conditions. Please correct me if I'm wrong (someone here certainly knows).
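In liburing terms, the pattern I have in mind looks roughly like this (a sketch, not the article's code; next_cqe and the 1 ms budget are made up for illustration):

    #include <liburing.h>
    #include <time.h>

    /* Spin on the completion queue for a bounded time, then make the one
     * "park" syscall, which still wakes on the first completion rather than
     * after a fixed interval. */
    static struct io_uring_cqe *next_cqe(struct io_uring *ring)
    {
        struct io_uring_cqe *cqe;
        struct timespec start, now;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (;;) {
            /* Busy-poll: no syscall, just reads of the shared CQ ring. */
            if (io_uring_peek_cqe(ring, &cqe) == 0)
                return cqe;   /* caller marks it seen with io_uring_cqe_seen() */
            clock_gettime(CLOCK_MONOTONIC, &now);
            long spun_ns = (now.tv_sec - start.tv_sec) * 1000000000L
                         + (now.tv_nsec - start.tv_nsec);
            if (spun_ns > 1000000L)   /* example budget: 1 ms of spinning */
                break;
        }
        /* Park: one io_uring_enter() syscall that sleeps until a completion
         * arrives -- and returns on the very first one. */
        return io_uring_wait_cqe(ring, &cqe) == 0 ? cqe : NULL;
    }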