Comment by hayd

3 days ago

Is this something likely to ever change?

I believe it's possible, but that it's a hard problem requiring great effort. I believe this is an opportunity to apply formal methods à la seL4, that nothing less will be sufficient, and that the value of io_uring is great enough to justify it. That will take a lot of talent and hours.

I admire io_uring. I appreciate the fact that it exists and continues despite the security problems; evidence that security "concerns" don't (yet) have a veto over all things Linux. The design isn't novel. High-performance hardware (NICs, HBAs, codecs, etc.) has used similar techniques for a long time. io_uring only brings this to user space and generalizes it. I imagine an OS and hardware that fully inculcate the pattern, obviating the need for context switches, interrupts, blocking and other conventional approaches we've slouched into since the inception of computing.

  • Alternatively, it requires cloud providers and such losing business if they refuse to support the latest features.

    The "surface area" argument against io_uring can apply to literally any innovation. Over on LWN, there's an article on path traversal difficulties that mentions how, because openat2(2) is often banned as inconvenient to whitelist using seccomp, people have to work around path traversal bugs using fiddly, manual, and slow element-by-element path traversal in user space.

    Ridiculous security theater. A new system call had a vulnerability in 2010 and so we're never able to take practical advantage of new kernel features ever?

    (It doesn't help that gvisor refuses to acknowledge the modern world.)

    Great example of descending into a shitty equilibrium because the great costs of a bad policy are diffuse but the slight benefits are concentrated.

    The only effective lever is commercial pressure. All the formal methods in the world won't help when the incentive structure reinforces technical obstinacy.

It already did with the io_uring worker rewrite in 5.12 (2021), which made it much safer.

https://github.com/axboe/liburing/discussions/1047

  • I can't agree with this. There is ample evidence of serious flaws since 2021. I hate that. I wish it weren't true. But an objective analysis of the record demands that view.

    Here is a fun one from September (CVE-2025-39816): "io_uring/kbuf: always use READ_ONCE() to read ring provided buffer lengths."

    That is an attacker's wet dream right there: bump the length and exfiltrate sensitive data. And it wasn't just some short-lived "Linus's branch" work no one actually ran: it existed for a time in, for example, Ubuntu 24.04 LTS (circa 2024 release date.) I just cherry-picked that one from among many.
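For readers unfamiliar with why a missing READ_ONCE() matters, here is a minimal user-space sketch of the general double-fetch pattern. It is not the kernel code from that fix: `shared_desc`, `race_hook`, `read_once_u32`, `bump_len` and `LEN_LIMIT` are all hypothetical names, and the racing thread is simulated by a callback so the outcome is deterministic.

```c
#include <assert.h>
#include <stddef.h>

#define LEN_LIMIT 8u

/* Hypothetical descriptor living in memory that user space can rewrite. */
struct shared_desc {
    unsigned int len;
};

/* Hypothetical hook standing in for a racing user-space thread. */
typedef void (*race_hook)(struct shared_desc *);

/* Simplified stand-in for the kernel's READ_ONCE(): one forced load. */
unsigned int read_once_u32(const unsigned int *p)
{
    return *(const volatile unsigned int *)p;
}

/* Buggy pattern: len is fetched once to validate and again to use. */
size_t len_double_fetch(struct shared_desc *d, race_hook race)
{
    assert(d != NULL);
    if (d->len > LEN_LIMIT)     /* first fetch: validation */
        return 0;
    if (race)
        race(d);                /* attacker rewrites len in this window */
    return d->len;              /* second fetch: unvalidated value */
}

/* Fixed pattern: snapshot len once, then validate and use the copy. */
size_t len_single_fetch(struct shared_desc *d, race_hook race)
{
    unsigned int len;

    assert(d != NULL);
    len = read_once_u32(&d->len);
    if (len > LEN_LIMIT)
        return 0;
    if (race)
        race(d);                /* too late: only the local copy is used */
    return len;
}

/* Example attacker: bump the length after validation has passed. */
void bump_len(struct shared_desc *d)
{
    d->len = 4096;
}
```

Starting from `len = 8` with `bump_len` as the hook, `len_double_fetch` hands back 4096 while `len_single_fetch` hands back the validated 8. That gap is the "bump the length" window described above.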

It’s manageable with eBPF instead of seccomp so one has to adapt to that. Should be doable.

  • Maybe not so doable. The whole point of io_uring is to reduce syscalls, so you end up with just three: io_uring_setup, io_uring_register, and io_uring_enter.

    There is now a memory buffer that both user space and the kernel read, and through that buffer you can _always_ request any operation that io_uring supports. Things like strace, eBPF, and seccomp cannot see the actual operations being requested in that memory buffer.

    And, having something like seccomp or eBPF inspect the stream might slow it down enough to eat the performance gain.
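The visibility problem can be illustrated with a toy model, assuming a filter that only observes syscall numbers at the syscall boundary. Nothing here is the real io_uring ABI: every `toy_` name and enum value is made up for the sketch, and the "kernel" merely counts how many queued entries it consumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical syscall numbers as seen by a syscall-boundary filter. */
enum toy_sys { TOY_SYS_OPENAT, TOY_SYS_READ, TOY_SYS_CONNECT,
               TOY_SYS_RING_ENTER, TOY_SYS_COUNT };

/* Hypothetical operation codes written into the shared ring. */
enum toy_op { TOY_OP_OPENAT, TOY_OP_READ, TOY_OP_CONNECT };

/* The filter's whole view of the world: a count per syscall number. */
struct toy_filter {
    unsigned int seen[TOY_SYS_COUNT];
};

/* Shared submission queue: filling it involves no syscall at all. */
struct toy_ring {
    enum toy_op sq[8];
    size_t tail;
};

/* Every real syscall passes through here, so the filter sees it. */
void toy_syscall(struct toy_filter *f, enum toy_sys nr)
{
    assert(f != NULL && nr < TOY_SYS_COUNT);
    f->seen[nr]++;
}

/* Plain memory write into the ring; the filter is never consulted. */
int toy_ring_push(struct toy_ring *r, enum toy_op op)
{
    assert(r != NULL);
    if (r->tail == sizeof r->sq / sizeof r->sq[0])
        return -1;              /* queue full */
    r->sq[r->tail++] = op;
    return 0;
}

/* One syscall submits everything queued; returns entries consumed. */
size_t toy_ring_enter(struct toy_filter *f, struct toy_ring *r)
{
    size_t n;

    assert(r != NULL);
    toy_syscall(f, TOY_SYS_RING_ENTER);
    n = r->tail;
    r->tail = 0;
    return n;
}
```

Pushing an open, a read, and a connect and then calling `toy_ring_enter` once leaves the filter with a single `TOY_SYS_RING_ENTER` hit and zero hits for the other three. To learn more, the filter would have to parse the ring contents itself, which is the per-entry inspection cost mentioned above.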