Comment by pornel

1 day ago

This can be done with exclusively owned objects. That's how io_uring abstractions work in Rust – you give your (heap allocated) buffer to a buffer pool, and get it back when the operation is done.

&mut references are exclusive and non-copyable, so the hot potato approach can even be used within their scope.

But the problem in Rust is that threads can unwind/exit at any time, invalidating buffers living on the stack, and io_uring may use the buffer for longer than the thread lives.

The borrow checker only checks what the code is doing; it has no power to alter runtime behavior (it's not a GC, after all). So it can only prevent io_uring abstractions from being handed on-stack buffers; it cannot stop threads from unwinding in order to make on-stack buffers safe instead.
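
A minimal sketch of the "hot potato" ownership hand-off described above. This is not a real io_uring API: `submit_read` is a hypothetical stand-in that uses a plain thread to play the role of the kernel, and the `data` parameter is invented for illustration. The point is only that the buffer is moved in and moved back out, so nothing on the caller's stack is borrowed while the operation is in flight.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for submitting a read: takes ownership of the
/// heap-allocated buffer and hands it back, with a byte count, on completion.
fn submit_read(mut buf: Vec<u8>, data: &'static [u8]) -> JoinHandle<(usize, Vec<u8>)> {
    // `thread::spawn` requires a 'static closure, so a borrowed on-stack
    // buffer would be rejected here at compile time.
    thread::spawn(move || {
        let n = data.len().min(buf.len());
        buf[..n].copy_from_slice(&data[..n]);
        (n, buf) // give the potato back
    })
}

fn read_with_owned_buffer() -> (usize, Vec<u8>) {
    let buf = vec![0u8; 8];
    let pending = submit_read(buf, b"hello");
    // `buf` has been moved; touching it here would be a compile error.
    pending.join().expect("worker thread panicked")
}
```

If the calling thread unwound before `join`, the buffer would still be owned by the in-flight operation rather than by a dead stack frame, which is the property the owned-buffer approach is after.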

jcranmer 1 day ago

Yes and no.

In my case, I have code that essentially looks like this:

   struct Parser {
     state: ParserState
   }
   struct Subparser {
     state: ParserState
   }
   impl Parser {
     pub fn parse_something(&mut self) -> Subparser {
       Subparser { state: self.state } // NOTE: doesn't work
     }
   }
   impl Drop for Subparser {
     fn drop(&mut self) {
       parser.state = self.state; // NOTE: really doesn't work
     }
   }

Okay, I can make the first line work by changing Parser.state to be an Option<ParserState> instead and using Option::take (or std::mem::replace on a custom enum; going from an &mut T to a T is possible in a number of ways). But how do I give Subparser the ability to give its ParserState back to the original parser? If I could make Subparser take a lifetime and just have a pointer to Parser.state, I wouldn't even bother with half of this setup because I would just reach into the Parser directly, but that's not an option in this case. (The safe Rust option I eventually reached for is a oneshot channel, which is actually a lot of overhead for this case).
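
For concreteness, here is a rough sketch of the workaround described above: `Option::take` to move the state out of `&mut self`, and a channel to hand it back on drop. It uses `std::sync::mpsc` as a stand-in for a oneshot channel, and the `position` field, `reclaim`, and `advance` are hypothetical names added only to make the example runnable.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

#[derive(Debug, Default, PartialEq)]
struct ParserState {
    position: usize, // hypothetical field, for illustration only
}

struct Parser {
    state: Option<ParserState>,
    returned: Option<Receiver<ParserState>>,
}

struct Subparser {
    state: Option<ParserState>,
    give_back: Sender<ParserState>,
}

impl Parser {
    fn new() -> Self {
        Parser { state: Some(ParserState::default()), returned: None }
    }

    pub fn parse_something(&mut self) -> Subparser {
        self.reclaim();
        // Option::take turns the `&mut Option<T>` into an owned `T`.
        let state = self.state.take().expect("state is already lent out");
        let (tx, rx) = channel();
        self.returned = Some(rx);
        Subparser { state: Some(state), give_back: tx }
    }

    /// Pick the state back up if a dropped Subparser has returned it.
    fn reclaim(&mut self) {
        if self.state.is_none() {
            if let Some(rx) = self.returned.take() {
                match rx.try_recv() {
                    Ok(state) => self.state = Some(state),
                    Err(_) => self.returned = Some(rx), // still lent out
                }
            }
        }
    }

    fn position(&mut self) -> Option<usize> {
        self.reclaim();
        self.state.as_ref().map(|s| s.position)
    }
}

impl Subparser {
    fn advance(&mut self, n: usize) {
        if let Some(s) = self.state.as_mut() {
            s.position += n;
        }
    }
}

impl Drop for Subparser {
    fn drop(&mut self) {
        if let Some(state) = self.state.take() {
            // The Parser may already be gone; ignore a closed channel.
            let _ = self.give_back.send(state);
        }
    }
}
```

The cost is visible in the shape of the code: every access on the `Parser` side has to poll the channel first, and the channel itself carries allocation and synchronization that a single-threaded parser does not need.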

It's the give-back portion of the borrow-to-give-back pattern that ends up being gnarly. I'm actually somewhat disappointed that the Rust ecosystem has in general given up on trying to build up safe pointer abstractions in the ecosystem, like doing use tracking for a pointed-to object. FWIW, a rough C++ implementation of what I would like to do is this:

  #include <cassert>
  #include <memory>

  template <typename T> class HotPotato {
    T *data;
    HotPotato<T> *borrowed_from = nullptr, *given_to = nullptr;

    public:
    T *get_data() {
      // If we've given the data out, we can't use it at the moment.
      return given_to ? nullptr : data;
    }
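    std::unique_ptr<HotPotato<T>> borrow() {
      assert(given_to == nullptr);
      auto *new_holder = new HotPotato();
      new_holder->data = data;
      new_holder->borrowed_from = this;
      given_to = new_holder;
      // Hand the new holder to the caller; destroying it gives the data back.
      return std::unique_ptr<HotPotato<T>>(new_holder);
    }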

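    ~HotPotato() {
      // Unlink this holder from the chain of lenders and borrowers.
      if (given_to) {
        given_to->borrowed_from = borrowed_from;
      }
      if (borrowed_from) {
        borrowed_from->given_to = given_to;
      } else if (!given_to) {
        // Last holder in the chain: nobody else can still reach the data.
        delete data;
      }
    }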
  };

pornel 18 hours ago

You can implement this in Rust.
It's the equivalent of Rc<Cell<(Option<Box<T>>, Option<Box<T>>)>>, but with the Rc replaced by a custom shared type that avoids keeping a refcount by having at most 2 owners.
You're going to need UnsafeCell to implement the exact solution, which takes a few lines of code that are as safe as the C++ version.
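
As a rough illustration of the safe, single-threaded end of this idea, the sketch below uses a plain `Rc<Cell<Option<T>>>` as the shared slot. It is a simplification, not the exact two-owner type described above: it keeps the ordinary `Rc` refcount and uses no `UnsafeCell` directly. The names `Lender`, `Borrowed`, `lend`, and `with` are hypothetical.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Owns the value while it is not lent out.
struct Lender<T> {
    slot: Rc<Cell<Option<T>>>,
}

/// Holds the value while it is lent out; puts it back on drop.
struct Borrowed<T> {
    value: Option<T>,
    home: Rc<Cell<Option<T>>>,
}

impl<T> Lender<T> {
    fn new(value: T) -> Self {
        Lender { slot: Rc::new(Cell::new(Some(value))) }
    }

    /// Move the value out of the slot; returns None if it is already lent.
    fn lend(&self) -> Option<Borrowed<T>> {
        let value = self.slot.take()?;
        Some(Borrowed { value: Some(value), home: Rc::clone(&self.slot) })
    }

    /// Use the value in place; returns None while it is lent out.
    fn with<R>(&self, f: impl FnOnce(&mut T) -> R) -> Option<R> {
        let mut value = self.slot.take()?;
        let out = f(&mut value);
        self.slot.set(Some(value));
        Some(out)
    }
}

impl<T> Borrowed<T> {
    fn get_mut(&mut self) -> &mut T {
        self.value.as_mut().expect("value is present until drop")
    }
}

impl<T> Drop for Borrowed<T> {
    fn drop(&mut self) {
        // Give the value back to the shared slot.
        self.home.set(self.value.take());
    }
}
```

Because neither type carries a lifetime, a `Borrowed<T>` can be stored and returned freely, and the lender simply sees `None` until it is dropped.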

alfiedotwtf 1 day ago

In my universe, `let` wouldn’t exist… instead there would only be 3 ways to declare variables:

  1. global my_global_var: GlobalType = …
  2. heap my_heap_var: HeapType = …
  3. stack my_stack_var: StackType = …

Global types would need to implement a global trait to ensure mutual exclusion (waves hands).

So by having the location of allocation in the type itself, we no longer have to do boxing mental gymnastics.

IX-103 1 day ago
Doesn't Rust do this? `let` is always on the stack. If you want to allocate on the heap then you need a Box. So `let foo = Box::new(MyFoo::default())` creates a Box on the stack that points to a MyFoo on the heap. So MyFoo is a stack type and Box<MyFoo> is a heap type. Or do you think there is value in defining MyFooStack and MyFooHeap separately to support both use cases?
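
A small sketch of the Box distinction described above, assuming a hypothetical `MyFoo` with a 128-byte payload. The binding itself only holds a pointer; the payload lives in the heap allocation.

```rust
use std::mem::size_of;

#[derive(Default)]
struct MyFoo {
    payload: [u64; 16], // hypothetical 128-byte payload
}

/// Returns (size of a MyFoo value, size of a Box<MyFoo> handle).
fn sizes() -> (usize, usize) {
    (size_of::<MyFoo>(), size_of::<Box<MyFoo>>())
}

fn boxed_foo() -> Box<MyFoo> {
    // The Box handle is pointer-sized; the MyFoo it points to is on the heap.
    let foo = Box::new(MyFoo::default());
    foo
}
```

Here `sizes()` reports 128 bytes for the value and one pointer width for the handle, whatever the optimizer later decides about where the handle itself is kept.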
- kbolino 1 day ago
  
  You may already know this, but let-bindings are not necessarily on the stack. The reference does say they are (it's important to remember that the reference is not normative), and it is often simpler to think of them that way, but in reality they don't have to be on the stack.
  The compiler can perform all sorts of optimizations, and on most modern CPU architectures, it is better to shove as many values into registers as possible. If you don't take the address of a variable, you don't run out of registers, and you don't call other, non-inlined functions, then let-bindings (and function arguments/return values) need not ever spill onto the stack.
  In some cases, values don't even get into registers. Small numeric constants (literals, consts, immutable lets) can simply be inlined as immediate values in the assembly/machine code. In the other direction, large constant arrays and strings don't go on the stack but rather into the constant pool.
  
- tele_ski 1 day ago
  
  The suggestion is basically C# class vs. struct, with explicit globals that are just classes with synchronization.
  
kbolino 1 day ago
But what does "heap my_heap_var" actually mean, without a garbage collector? Who owns "my_heap_var" and when does it get deallocated? What does explicitly writing out the heap-ness of a variable ultimately provide, that Rust's existing type system with its many heap-allocated types (Box, Rc, Arc, Vec, HashMap, etc.) doesn't already provide?
- alfiedotwtf 3 hours ago
  
  > What does explicitly writing out the heap-ness of a variable ultimately provide, that Rust's existing type system with its many heap-allocated types (Box, Rc, Arc, Vec, HashMap, etc.) doesn't already provide?
  To be honest, I was thinking more in terms of cognitive overload, i.e. is all that Box boilerplate even needed if we were to treat every `heap my_heap = …` as a Box underneath? In other words, couldn’t we elide all of that away:
  let foo = Box::new(MyFoo::default());
  Becomes:
  heap foo = MyFoo::default();
  Much nicer!
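
For what it's worth, something close to that surface syntax can be approximated in today's Rust with a declarative macro. This is only a sketch: `heap!` is a hypothetical macro, not an existing language feature or library item, and `MyFoo` with its `value` field is invented for the example.

```rust
#[derive(Default)]
struct MyFoo {
    value: u32, // hypothetical field
}

/// Hypothetical sugar: `heap!(name = expr);` expands to `let name = Box::new(expr);`.
macro_rules! heap {
    ($name:ident = $value:expr) => {
        let $name = Box::new($value);
    };
}

fn make_foo() -> Box<MyFoo> {
    heap!(foo = MyFoo::default());
    foo // still an ordinary Box<MyFoo> underneath
}
```

This only hides the `Box::new` call at the binding site; the type is still `Box<MyFoo>` everywhere else, so ownership and deallocation work exactly as they do today.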