Comment by vlovich123

1 day ago

I really don’t understand this argument. If you force the user to transfer ownership of the buffer into the I/O subsystem, the system can hand that buffer to the async runtime instead of leaving it held within the cancellable future. The future then returns the buffer, which is given back once the completion is received from the kernel. What am I missing?

Inufu 1 day ago

Requiring ownership transfer gives up on one of the main selling points of Rust: being able to verify reference lifetimes and safety at compile time. If we have to give up on references, then a lot of Rust's complexity no longer buys us anything.

vlovich123 1 day ago
I'm not sure what you're trying to say, but the compile-time safety requirement isn't given up. It would look something like:
self.buffer = io_read(self.buffer)?
This isn't much different from
io_read(&mut self.buffer)?
since Rust doesn't permit simultaneous access while a mutable reference is held.
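A minimal sketch of the two call shapes being compared, using blocking `std::io::Read` as a stand-in for an async read. The names `io_read_owned`, `io_read_borrowed`, `Conn`, `fill_owned` and `fill_borrowed` are hypothetical and only illustrate the idea. One detail the one-liner glosses over: from a `&mut self` method the buffer cannot be moved out of the field directly, so the sketch uses `std::mem::take` to swap in an empty `Vec` first.

```rust
use std::io::{self, Read};

/// Hypothetical owned-buffer read: takes the buffer by value and hands it
/// back together with the number of bytes read.
fn io_read_owned<R: Read>(src: &mut R, mut buf: Vec<u8>) -> io::Result<(Vec<u8>, usize)> {
    let n = src.read(&mut buf)?;
    Ok((buf, n))
}

/// Borrowing counterpart: the caller keeps ownership the whole time.
fn io_read_borrowed<R: Read>(src: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    src.read(buf)
}

struct Conn {
    buffer: Vec<u8>,
}

impl Conn {
    fn fill_owned<R: Read>(&mut self, src: &mut R) -> io::Result<usize> {
        // Move the buffer out of `self`, leaving an empty Vec (no allocation).
        let buf = std::mem::take(&mut self.buffer);
        // In this sketch, an error here drops the buffer and `self.buffer`
        // stays empty; a real API would need to return it on failure too.
        let (buf, n) = io_read_owned(src, buf)?;
        self.buffer = buf;
        Ok(n)
    }

    fn fill_borrowed<R: Read>(&mut self, src: &mut R) -> io::Result<usize> {
        io_read_borrowed(src, &mut self.buffer)
    }
}
```

Both methods need exclusive access to `self.buffer` for the duration of the call, which is the similarity the comment points at; the owned version additionally has to decide what happens to the buffer on the error path.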
- Inufu 15 hours ago
  
  It means you can no longer, for example, get multiple disjoint references into the same buffer for parallel reads/writes of independent chunks.
  Or well, you can, using unsafe, Arc and Mutex, but at that point the safety guarantees aren’t much better than what I get in well-designed C++.
  Don’t get me wrong, I still much prefer Rust, but I wish async and references worked together better.
  Source: I recently wrote a high-throughput RPC library in Rust (saturating > 100 Gbit NICs)
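For reference, here is the borrow-based pattern the comment describes, sketched with blocking scoped threads rather than async I/O. The function name `fill_chunks_parallel` is hypothetical. `chunks_mut` hands out non-overlapping `&mut [u8]` slices of one buffer, and the compiler checks that none of them outlives the buffer; this is the kind of guarantee that is lost when each chunk has to be passed by ownership instead.

```rust
use std::thread;

/// Writes to independent chunks of one buffer in parallel.
/// Each worker gets an exclusive `&mut [u8]` to its own chunk, with no
/// `unsafe`, `Arc` or `Mutex`. `chunk_len` must be non-zero.
fn fill_chunks_parallel(buf: &mut [u8], chunk_len: usize) {
    thread::scope(|s| {
        for (i, chunk) in buf.chunks_mut(chunk_len).enumerate() {
            // The scope guarantees every thread is joined before `buf`
            // becomes usable again, so the borrows are sound.
            s.spawn(move || {
                chunk.fill(i as u8);
            });
        }
    });
}
```

This works because the scope blocks until all workers finish. A future that can be dropped mid-operation offers no such guarantee, which is where the tension discussed in this thread comes from.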

newpavlov 1 day ago

The goal of the async system is to allow users to write synchronous-looking code which is executed asynchronously, with all the associated benefits. "Forcing" users to do stuff like this shows a clear failure to achieve this goal. Additionally, passing ownership like this (instead of passing a mutable borrow) arguably goes against the zero-cost principle.

vlovich123 1 day ago
I don’t follow the zero copy argument. You pass in an owned buffer and get an owned buffer back out. There’s no copying happening here. It’s your claim that async is supposed to look like synchronous code, but I don’t buy it. I don’t see why that’s a goal. Synchronous is an anachronistic software paradigm for a computer hardware architecture that never really existed (electronics are concurrent and asynchronous by nature), and trying to make things work that way causes a lot of performance problems.
Indeed, one thing I’ve always wondered is whether you can submit a read request for a page-aligned buffer and have the kernel arrange for data to be written directly into it without any additional copies. That’s probably not possible, since there’s routing happening in the kernel and it accumulates everything into sk_buffs.
But maybe it could arrange for the framing part of the packet and the data to be decoupled so that it can just give you a mapping into the data region (maybe instead of you even providing a buffer, it gives you back an address mapped into your space). Not sure if that TLB update might be more expensive than a single copy.
- newpavlov 1 day ago
  
  You have an inevitable overhead of managing the owned buffer when compared against simply passing a mutable borrow to an already existing buffer. Imagine if `io::Read` APIs were constructed as `fn read(&mut self, buf: Vec<u8>) -> io::Result<Vec<u8>>`.
  Parity with synchronous programming is an explicit goal of Rust async, stated many times (e.g. see here https://github.com/rust-lang/rust-project-goals/issues/105). I agree with your rant about the illusion of synchronicity, but it does not matter. The synchronous abstraction is immensely useful in practice, and the less leaky it is, the better.
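A rough sketch of what that imagined signature would mean at a call site. `OwnedRead`, `read_owned`, `read_header_borrowed` and `read_header_owned` are hypothetical names, not part of std, and since the imagined signature has no separate byte count, this sketch assumes the returned `Vec` is truncated to the bytes read.

```rust
use std::io::{self, Read};

/// Hypothetical trait mirroring the signature imagined above.
trait OwnedRead {
    fn read_owned(&mut self, buf: Vec<u8>) -> io::Result<Vec<u8>>;
}

/// Adapter so any ordinary reader can be used through the owned API.
impl<R: Read> OwnedRead for R {
    fn read_owned(&mut self, mut buf: Vec<u8>) -> io::Result<Vec<u8>> {
        let n = self.read(&mut buf)?;
        buf.truncate(n); // assumption: length encodes how much was read
        Ok(buf)
    }
}

/// Borrowed API: a stack array (or a sub-slice of a larger buffer) works.
fn read_header_borrowed<R: Read>(src: &mut R) -> io::Result<[u8; 4]> {
    let mut header = [0u8; 4];
    src.read_exact(&mut header)?;
    Ok(header)
}

/// Owned API: the caller must supply a heap-allocated Vec, even for 4 bytes.
fn read_header_owned<R: OwnedRead>(src: &mut R) -> io::Result<Vec<u8>> {
    src.read_owned(vec![0u8; 4])
}
```

The borrowed version can target any memory the caller already has, while the owned version has to allocate or recycle a `Vec` on every call, which is one reading of the "managing the owned buffer" overhead mentioned above.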
  
- namibj 1 day ago
  
  Such reads are in principle supported if you have sufficient hardware offloading of your stream. AFAIK io_uring got an update a while back specifically to make this practical for non-stream reads: you basically provide a slab allocator region to the ring and tell reads to pick a free slot/slab in that region _only when they actually get the data_, instead of tying up DMA-capable memory for as long as the remote takes to send you the data.
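The late-binding idea can be modelled in plain user-space Rust. This is a toy illustration, not the io_uring API: `SlabPool` and its methods are hypothetical, and in the real mechanism the kernel picks the buffer. The point it shows is that pending reads hold no memory, and a slab is claimed only at the moment data arrives.

```rust
/// Toy model of a buffer pool shared by many pending reads.
struct SlabPool {
    slabs: Vec<Vec<u8>>,
    free: Vec<usize>, // indices of slabs not currently holding data
}

impl SlabPool {
    fn new(count: usize, slab_len: usize) -> Self {
        SlabPool {
            slabs: vec![vec![0u8; slab_len]; count],
            free: (0..count).rev().collect(), // pop() yields 0, 1, 2, ...
        }
    }

    fn free_count(&self) -> usize {
        self.free.len()
    }

    /// Called when data actually arrives: only now is a slab claimed.
    /// Returns (slab index, bytes stored), or None if the pool is exhausted.
    /// Data longer than a slab is truncated in this simplified model.
    fn complete_read(&mut self, data: &[u8]) -> Option<(usize, usize)> {
        let idx = self.free.pop()?;
        let n = data.len().min(self.slabs[idx].len());
        self.slabs[idx][..n].copy_from_slice(&data[..n]);
        Some((idx, n))
    }

    fn slab(&self, idx: usize) -> &[u8] {
        &self.slabs[idx]
    }

    /// Hand a slab back once the application has consumed its contents.
    fn release(&mut self, idx: usize) {
        self.free.push(idx);
    }
}
```

With this shape, a thousand idle connections cost no buffer memory until one of them receives data, at which point `complete_read` reports which slab was used and the application calls `release` when it is done with it.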