Comment by zbentley
5 months ago
> what you really want is to ask io_uring to allocate the pages itself so that for reads it gives you pages that were allocated by the kernel
Okay, but what about writes? If I have a memory region that I want io_uring to write from, it's a major pain in the ass to manage the lifetime of objects in that region safely. My choices are basically: manually manage the lifetime and only allow the memory to be dropped once I see a completion show up (this is what nearly everything does now, and it's a) hard to get right and b) limited in many ways, e.g. it's heap-only), or permanently leak that memory as unusable.
You ask the I/O system for a writable buffer. When you fill it up, you hand it off. Once the I/O finishes, it goes back into the available pool of memory to write with. This is how high-performance I/O works.
Okay, but... how would that work? A syscall that gives back a pointer (I thought the point was to avoid syscalls/context switches)? An io_malloc userspace function (great, now how do I manage lifetimes of the buffers it hands out)? Something else?
The memory is allocated by the runtime that has the io_uring backend. You ask it for memory which it manages in its own memory allocator. Lifetime is managed no differently than Vec. For example, when you drop the DmaBuffer [1] it goes back into the pool. Or you hand it off as an I/O submission after filling it up.
The memory frequently needs to be mlocked anyway, so a general-purpose allocator doesn't work.
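To make the "lifetime is managed like Vec" point concrete, here is a minimal sketch of the pattern in plain Rust. It is not glommio's implementation and does not call io_uring: `BufferPool`, `PooledBuf`, `InFlight`, `submit` and `complete` are all hypothetical names, the pool is single-threaded, and the buffers are ordinary heap memory rather than mlocked or registered pages. It only shows the ownership flow: take a buffer from the pool, fill it, and either drop it or move it into a submission that holds it until the completion arrives.

```rust
use std::cell::RefCell;
use std::ops::{Deref, DerefMut};
use std::rc::Rc;

type FreeList = Rc<RefCell<Vec<Box<[u8]>>>>;

/// Hypothetical pool of fixed-size buffers owned by the I/O runtime.
/// A real runtime would back this with mlocked / registered memory.
pub struct BufferPool {
    free: FreeList,
}

impl BufferPool {
    pub fn new(count: usize, buf_size: usize) -> Self {
        let free: Vec<Box<[u8]>> = (0..count)
            .map(|_| vec![0u8; buf_size].into_boxed_slice())
            .collect();
        BufferPool { free: Rc::new(RefCell::new(free)) }
    }

    /// Take a buffer out of the pool, or None if the pool is exhausted.
    pub fn alloc(&self) -> Option<PooledBuf> {
        let data = self.free.borrow_mut().pop()?;
        Some(PooledBuf { data: Some(data), home: Rc::clone(&self.free) })
    }

    /// Number of buffers currently sitting in the pool.
    pub fn available(&self) -> usize {
        self.free.borrow().len()
    }
}

/// Owned handle to one buffer; returns the buffer to the pool on drop.
pub struct PooledBuf {
    data: Option<Box<[u8]>>,
    home: FreeList,
}

impl Deref for PooledBuf {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        self.data.as_deref().expect("buffer is present until drop")
    }
}

impl DerefMut for PooledBuf {
    fn deref_mut(&mut self) -> &mut [u8] {
        self.data.as_deref_mut().expect("buffer is present until drop")
    }
}

impl Drop for PooledBuf {
    fn drop(&mut self) {
        // Hand the memory back to the pool instead of freeing it.
        if let Some(buf) = self.data.take() {
            self.home.borrow_mut().push(buf);
        }
    }
}

/// Stand-in for a submitted write: it owns the buffer while the
/// (imaginary) kernel operation is in flight.
pub struct InFlight {
    _buf: PooledBuf,
}

/// Hypothetical submission: moving the buffer in means the caller
/// can no longer touch or drop it while the I/O is pending.
pub fn submit(buf: PooledBuf) -> InFlight {
    InFlight { _buf: buf }
}

impl InFlight {
    /// Called when the completion shows up; dropping `self` drops the
    /// buffer, which puts it back into the pool.
    pub fn complete(self) {}
}
```

The design choice that matters is that `submit` takes the buffer by value. The caller has no handle left to free or reuse, so there is nothing to get wrong while the write is pending, and the memory becomes available again only when the runtime processes the completion.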
[1] https://docs.rs/glommio/latest/glommio/fn.allocate_dma_buffe...