
Comment by josephg

21 hours ago

If we invented a new language, sure. Cooperative multitasking might be a fun approach. The avalanche of bugs I’m imagining would come from existing JavaScript code being run in a different context than the one in which it was written and tested. If you pass me a callback right now and I call `a(); callback(); b();`, I can guarantee that the program doesn’t yield to the event loop or other executions between a() and b(). As I understand it, this guarantee no longer holds with cooperative multitasking, because your callback can yield to another thread.

Good on the V8 team. Sounds like they’ve figured out a way to get the performance of green threads with the better ergonomics of effects systems (async await). Great!

You sound like an expert in cooperative multitasking. If async await can use real stacks, what actual benefits does cooperative multitasking offer? Why prefer green threads over what JS has now? Pitch them to me.

> The avalanche of bugs I’m imagining would come from existing JavaScript code being run in a different context than that in which it was written and tested.

Oh, right. As you said, the ship has sailed. I think you could bolt green threads onto javascript now without ill effects - apart from bloating the language. I can't see anything that could go badly (certainly no avalanche of bugs). But in javascript green threads are only mildly more ergonomic than async, and I wouldn't bloat the language for such a small return.

Rust is in a different position. The current async implementation has two big warts. Firstly, they had to come up with a type-safe way of saving a function's current state - by state, I mean what a function normally stores on its stack. What they came up with is a work of art in some ways, but it doesn't work well with the borrow checker. The borrow checker insists you prove that you have exclusive use of a variable while it exists. Things on the stack have a limited lifetime (the function call), so the compiler knows they don't exist for very long. Even with that small lifetime it's a battle, but it's workable. Async persists that state, usually to the heap, where it can effectively live forever. That wreaks havoc with the borrow checker, prompting comments like this: https://news.ycombinator.com/item?id=37436274, quote: "Yes, async is effectively a much harder version of Rust ...".

The second issue is colouring. In the current Rust async implementation, large chunks are left to libraries, like tokio. Each of these libraries has to provide its own I/O, and they aren't compatible. So if you want to use a cute new HTTP server, you are out of luck unless its authors provided a version that talks to the async library you are using.

The library writers do their best to accommodate this by providing interfaces to the popular async libraries, but that forces them to do extra work. Whereas before they could just call `std::fs::File::read()`, now they have to abstract all the I/O they do into a separate module, and provide an implementation of that module for each async library they want to support.

The outcome can only be described as a mess, and that's putting it politely. It's harming uptake of the language. It wasn't as if they didn't know it was coming, either - there were comments pleading for a better implementation. And it wasn't as if better solutions weren't already apparent - they had green threads before; they just made some wrong turns with the implementation that needed to be fixed. Nor were those solutions harder to do than the async implementation they came up with. Async needed new standard library features to stabilise (like `Pin<>`) and introduced new keywords - none of which was needed for green threads. (Although some things would be useful for an efficient green thread implementation - like knowing the maximum amount of stack a function could use.)

In the face of all that, they persisted with async. You'd need a sociologist to explain how that happened - to my engineering brain it's inexplicable. Unlike in Javascript, it isn't just a mildly less ergonomic implementation of the same thing; it's a serious mistake - well worth the effort of throwing out and replacing.

  • On all that, we have near total agreement. I've been complaining about how broken and half-baked rust's async story is for years - for more or less the same reasons you list above:

    - You can't name the type of an impl Future.

    - They play terribly with the borrow checker because the borrow checker can't handle self referential types.

    - There's no future executor in the standard library. You need 3rd party libraries. And the most common library is tokio, which is a whale.

    - Despite all the work, there's still no async streams in the language.

    - Pin. !Unpin. pin_project. Unsafe pin_project. What are we even doing.

    But async works really well in javascript. Maybe where we disagree is that I don't think any of these issues are because async itself is a bad idea. But, async has become the place dreams go to die in rust. Look at the issues above. They're all problems with rust's type system, borrow checker and standard library.

    What I think rust needs is:

    - A way to have self-borrows in a struct. Types with self borrows would be implicitly pinned.

    - A way to name the return type of a function. Eg `let x: ReturnType<some_func>`. People have been saying this is right around the corner since 2019.

    - Generators. Futures are built on top of generators inside the compiler. But generators have - for some reason - never been exposed in stable rust. I think generators should have been stabilised first - since all the problems you need to solve to make generators work well (self referential types, return values you can name, etc) are things futures need too.

    Unfortunately I think that ship has sailed too. I try to avoid async rust whenever I can. It's such a pity. I'm hoping someone makes a rust 2.0 language at some point which fixes this situation.

    • > I think generators should have been stabilised first - since all the problems you need to solve to make generators work well (self referential types, return values you can name, etc) are things futures need too.

      Generators are an interesting case. For example, if you implemented a Vec iterator as a generator, it becomes:

          fn vec_iter(&self) {
             for index in 0..self.len() {
                 yield &self[index];
             }
          }
      

      Which is arguably easier to understand than the current event-driven formulation, which requires you to declare a new type to hold your state; the code looks like:

          fn next(&mut self) -> Option<&T> {
             if self.index >= self.vec.len() {
                  None
             } else {
                  self.index += 1;
                  Some(&self.vec[self.index - 1])
             }
          }
      

      Effectively, the stack frame has become your type, and sequential code is always so much more compact and clearer than the event-driven model. The generator could be implemented as a green thread, but you would never entertain the overhead of creating the new stack a green thread implementation needs.

      However ... the async implementation built all the mechanics needed to get rid of that green thread stack allocation when the size of the stack is known, as it is in this case. The state-saving machinery created for async could be used to translate that stack to a type. It would, surprise, surprise, contain just `index` - analogous to the iterator type we have to manually create for event-driven code. So the compiler could translate the green thread to the same implementation as the event-driven code, but you get to use the compact (and very familiar) syntax of a stack machine.

      I found it interesting to consider what happens with a more complex generator - say, one that returns every node in a tree. You can do it recursively, which is simple, clear code, but then you don't know the size of the stack, so the trick used for the vec iterator (translating it to a type) can't be used. Or you can store the state that the recursive implementation kept on the stack in a `Vec<>` instead. Both require a memory allocation, but they are different. One is a normal malloc'd allocation that must be reallocated and moved as it grows. The other can use the OS's stack implementation, which doesn't move as it grows. If you reused stacks, the OS's stack implementation would be faster in a long-running program.

      Notice that the transformation from a generator to an async implementation is arguably more complex than the same transformation for green threads, especially for the tree traversal.

      That observation is one of the reasons I'm such a strong proponent of green threads. The other is a simpler mental model. Unlike async, you don't have to expose the inner mechanisms it depends on, like futures.
