Comment by nromiun
10 days ago
It is an inherent limitation. Multithreading is not free after all. One of the big pros of async programming is the concurrency you get within a single thread. When you make the async runtime multithreaded by default (like Tokio) you don't get this advantage anymore.
You can put tokio in single-threaded mode if you prefer - it's an explicit performance tradeoff. The multithreaded work-stealing executor has higher throughput at the expense of needing more synchronisation.
Or you can schedule your thread-local tasks in a LocalSet to run them all on the owning thread, while keeping the other threads around to handle tasks that are intentionally parallel.
The general theme here is that tokio (and C++ equivalents) provide you the flexibility to do more things than the native Python/Node runtime does (and yes, the defaults take advantage of this). But the underlying intention is the same (and post-GIL we expect to see some movement in this direction on the Python front as well).