Comment by zbentley

5 months ago

> what you really want is to ask io_uring to allocate the pages itself so that for reads it gives you pages that were allocated by the kernel

Okay, but what about writes? If I have a memory region that I want io_uring to write, it's a major pain in the ass to manage the lifetime of objects in that region in a safe way. My choices are basically: manually manage the lifetime and only allow it to be dropped when I see a completion show up (this is what most everything does now, and it's a) hard to get right and b) limited in many ways, e.g. it's heap-only), or permanently leak that memory as unusable.

You ask the I/O system for a writable buffer. When you fill it up, you hand it off. Once the I/O finishes, it goes back into the available pool of memory to write with. This is how high-performance I/O works.

  • Okay, but... how would that work? A syscall gives back a pointer (I thought the point was to avoid syscalls/context switches)? An io_malloc userspace function (great, now how do I manage lifetimes of the buffers it hands out)? Something else?

    • The memory is allocated by the runtime that has the io_uring backend. You ask it for memory which it manages in its own memory allocator. Lifetime is managed no differently than Vec. For example, when you drop the DmaBuffer [1] it goes back into the pool. Or you hand it off as an I/O submission after filling it up.

      The memory frequently needs to be mlocked memory anyway, so a general purpose allocator doesn't work.

      [1] https://docs.rs/glommio/latest/glommio/fn.allocate_dma_buffe...
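
To make the "lifetime is managed no differently than Vec" point concrete, here is a minimal sketch of the pattern using only the standard library. It is not glommio's implementation: `BufferPool`, `PooledBuf`, `alloc`, and `FakeRing` are hypothetical names invented for illustration, the buffers are ordinary heap memory rather than mlocked or registered memory, and `FakeRing` only simulates submission and completion. The point is just the ownership flow: the pool hands out an owned buffer, dropping it returns it to the pool, and submitting it moves ownership to the I/O side until the completion arrives.

```rust
use std::cell::RefCell;
use std::ops::{Deref, DerefMut};
use std::rc::Rc;

/// Hypothetical runtime-owned pool of fixed-size buffers.
struct BufferPool {
    free: RefCell<Vec<Box<[u8]>>>,
}

impl BufferPool {
    fn new(count: usize, buf_size: usize) -> Rc<Self> {
        let free = (0..count)
            .map(|_| vec![0u8; buf_size].into_boxed_slice())
            .collect();
        Rc::new(BufferPool { free: RefCell::new(free) })
    }

    /// Number of buffers currently available to hand out.
    fn available(&self) -> usize {
        self.free.borrow().len()
    }
}

/// Owned handle to one buffer. Dropping it returns the memory to the pool.
struct PooledBuf {
    data: Option<Box<[u8]>>,
    pool: Rc<BufferPool>,
}

/// Take a buffer from the pool, or None if the pool is exhausted.
fn alloc(pool: &Rc<BufferPool>) -> Option<PooledBuf> {
    let data = pool.free.borrow_mut().pop()?;
    Some(PooledBuf { data: Some(data), pool: Rc::clone(pool) })
}

impl Deref for PooledBuf {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        self.data.as_deref().expect("buffer present until drop")
    }
}

impl DerefMut for PooledBuf {
    fn deref_mut(&mut self) -> &mut [u8] {
        self.data.as_deref_mut().expect("buffer present until drop")
    }
}

impl Drop for PooledBuf {
    fn drop(&mut self) {
        // Return the memory to the pool instead of freeing it.
        if let Some(data) = self.data.take() {
            self.pool.free.borrow_mut().push(data);
        }
    }
}

/// Stand-in for the I/O backend: it owns buffers while writes are in flight.
struct FakeRing {
    in_flight: Vec<PooledBuf>,
}

impl FakeRing {
    fn new() -> Self {
        FakeRing { in_flight: Vec::new() }
    }

    /// Submitting moves the buffer, so the caller can no longer touch it.
    fn submit_write(&mut self, buf: PooledBuf) {
        self.in_flight.push(buf);
    }

    /// Simulate completions: dropping each buffer sends it back to the pool.
    fn complete_all(&mut self) -> usize {
        let n = self.in_flight.len();
        self.in_flight.clear();
        n
    }
}

fn main() {
    let pool = BufferPool::new(2, 4096);
    let mut ring = FakeRing::new();

    let mut buf = alloc(&pool).expect("pool has free buffers");
    buf[..5].copy_from_slice(b"hello");
    ring.submit_write(buf); // ownership moves to the I/O side
    assert_eq!(pool.available(), 1);

    ring.complete_all(); // completion observed, buffer is recycled
    assert_eq!(pool.available(), 2);
}
```

Because `submit_write` takes the buffer by value, the compiler rejects any use of it after submission, which is what removes the manual "wait for the completion before dropping" bookkeeping from the caller.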