Comment by zarzavat
5 days ago
I continue to believe that free-threading hurts performance more than it helps and Python should abandon it.
Having to have thread-safe code all over the place just for the 1% of users who need multi-threading in Python and can't use subinterpreters for some reason is nuts.
> Having to have thread-safe code all over the place just for the 1% of users who need multi-threading in Python and can't use subinterpreters for some reason is nuts.
Way more than 1% of the community, particularly of the community actively developing Python, wants free-threaded Python. The problem here is that the Python community consists of several different groups:
1. Basically pure Python code with no threading
2. Basically pure Python with appropriate thread safety
3. Basically pure Python with threaded code that is already broken and just getting lucky for now
4. Mixed Python and C/C++/Rust code, with appropriate threading behavior in the C or C++ components
5. Mixed Python and C or C++ code, with C and C++ components depending on GIL behavior
Group 1 sees slightly reduced performance. Groups 2 and 4 get a major win from free-threaded Python, being able to use threading through their interfaces to C/C++/Rust components. Group 3 is already writing buggy code and will probably see worse consequences from its existing bugs. Group 5 will have to either avoid threading in its Python code or rewrite its C/C++ components.
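The Group 3 failure mode is easy to reproduce: `counter += 1` is several bytecode operations (load, add, store), and the interpreter can switch threads between them, GIL or no GIL. A minimal sketch (the counts and function names are made up for illustration):

```python
import threading

counter = 0
lock = threading.Lock()

def unsafe_inc(n):
    # "Group 3" style: counter += 1 is load/add/store, and a thread
    # switch between those steps can lose increments.
    global counter
    for _ in range(n):
        counter += 1

def safe_inc(n):
    # "Group 2" style: an explicit mutex, correct with or without the GIL.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

def run(target, n=100_000, workers=4):
    global counter
    counter = 0
    threads = [threading.Thread(target=target, args=(n,)) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(run(safe_inc))    # always 400000
print(run(unsafe_inc))  # may fall short on some runs; "getting lucky" otherwise
```

The locked version is correct on every build; the unlocked version was only ever correct by accident, and free-threading just makes the accident more likely to surface.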
Right now, a big portion of the Python language developer base consists of Groups 2 and 4. Group 5 is basically perceived as holding Python-the-language and Python-the-implementations back.
Where is the major win? Sorry but I just don't see the use case for free-threading.
Native code can already be multi-threaded so if you are using Python to drive parallelized native code, there's no win there. If your Python code is the bottleneck, well then you could have subinterpreters with shared buffers and locks. If you really need to have shared objects, do you actually need to mutate them from multiple interpreters? If not, what about exploring language support for frozen objects or proxies?
The only thing that free threading gives you is concurrent mutations to Python objects, which is like, whatever. In all my years of writing Python I have never once found myself thinking "I wish I could mutate the same object from two different threads".
> Native code can already be multi-threaded so if you are using Python to drive parallelized native code, there's no win there.
When using something like boost::python or pybind11 to expose a native API in Python, it is not uncommon for the native API to be extensible via inheritance or callbacks (both easy to represent in these binding tools). Today, with the GIL, you are effectively forced to choose between exposing the native API's parallelism and exposing its extensibility: you can expose a method that performs parallel evaluation of some inputs, or you can expose a user-provided callback to be run on the output of each evaluation, but you cannot evaluate those inputs and run a user-provided callback in parallel.
The "dumbest" form of this is with logging; people want to redirect whatever logging the native code may perform through whatever they are using for logging in Python, and that essentially creates a Python callback on every native logging call that currently requires a GIL acquire/release.
Could some of this be addressed with various Python-specific workarounds/tools? Probably. But doing so is probably also going to tie the native code much more tightly to problematic/weird Pythonisms (in many cases, the native library in question is an entirely standalone project).
> The only thing that free threading gives you is concurrent mutations to Python objects, which is like, whatever.
The big benefit is that you get concurrency without the overhead of multi-process. Shared memory is always going to be faster than having to serialize for inter-process communication (let alone that not all Python objects are easily serializable).
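You can see the difference in a few lines: a thread-side queue hands over the very same object, while anything that crosses a process boundary goes through pickle, which copies (and sometimes just fails). A small sketch (the payload is made up):

```python
import pickle
import queue

payload = {"weights": list(range(1000)), "note": "shared state"}

# Threads share one heap: a queue.Queue hands over the same object.
q = queue.Queue()
q.put(payload)
assert q.get() is payload  # no copy, no serialization

# Processes don't: anything sent via multiprocessing is pickled,
# so the receiver gets an equal but distinct copy.
copy = pickle.loads(pickle.dumps(payload))
assert copy == payload and copy is not payload

# And some objects can't cross a process boundary at all:
try:
    pickle.dumps(lambda x: x)  # lambdas aren't picklable
except (pickle.PicklingError, AttributeError, TypeError) as e:
    print("not serializable:", e)
```

With free-threading you get the first behavior plus real parallelism; with multi-process you always pay for the second.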
Maybe they could have two versions of the interpreter, one that’s thread-safe and one that’s optimised for single-threading?
Microsoft used to do this for their C runtime library.
That's exactly what we have now, and it looks like the Python devs want a single unified build at some point.
PHP does this as well. Most distributions ship PHP without thread safety, but it's seeing more use now that FrankenPHP uses it. Speaking of which, it would be nice if PHP's JIT got a little love: it's never eked out more than marginal gains in heavily-numeric code.
Pure Python code has always needed mutexes for thread safety, with or without the ol' GIL. I thought the difficulty with removing the GIL instead had to do with C extensions that rely on it.
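Right: the GIL never made compound operations atomic. A classic check-then-act bug, sketched below with a toy memoization (the key and "expensive computation" are made up), can run the work twice even under the GIL unless you hold a mutex across the whole sequence:

```python
import threading

calls = 0
cache = {}
lock = threading.Lock()

def get(key):
    # Check-then-act: without the lock, two threads can both see a
    # cache miss before either writes, so the work runs twice even
    # under the GIL. Holding the mutex makes the sequence atomic.
    global calls
    with lock:
        if key not in cache:
            calls += 1
            cache[key] = key * 2  # stand-in for an expensive computation
        return cache[key]

threads = [threading.Thread(target=get, args=("k",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(calls)  # 1: the lock guarantees the computation runs exactly once
```

That discipline is exactly the same in the free-threaded build, which is why correct code stays correct.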
This is accurate and the parent commenter here seems to be echoing a common misconception. Either they are confused or they need to elaborate more to demonstrate that they have a valid complaint.
For instance, this would have been a valid complaint:
"Users who don't need free threading will now suffer a performance penalty for their single-threaded code."
That is true. But if you are currently using multiple threads, code that was correct before will still be correct in the free threaded build, and code that was incorrect before will still be incorrect.
I got it wrong too, even after reading the docs, until I tried it for myself. Intuitively, why would Python have threads that can't run fully in parallel but can still create race conditions? It seems like the worst of both worlds, and it almost is, except that you can still speed up I/O-bound work, or even CPU-bound C calls that release the GIL, this way.
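The I/O-bound speedup is easy to demonstrate even on a GIL build, since blocking calls release the GIL while they wait. A minimal sketch, with `time.sleep` standing in for a network or disk call (the delay and thread count are made up):

```python
import threading
import time

def fetch(delay=0.1):
    # Stand-in for a blocking network/disk call; time.sleep releases
    # the GIL, so other threads run while this one waits.
    time.sleep(delay)

start = time.perf_counter()
threads = [threading.Thread(target=fetch) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Four 0.1s waits overlap instead of summing to 0.4s.
print(f"{elapsed:.2f}s")
```

Replace the sleep with a pure-Python computation and the threads serialize again on a GIL build; that's the gap free-threading closes.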
Some also mix it up with async JS, which also can't use multiple CPUs and guarantees in-order execution until you "await" something. Well, that's asyncio in Python now. It doesn't help that so much literature uses muddy terms like "async programming."
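The asyncio guarantee is easy to see: between awaits, a coroutine runs to completion with no interleaving from other tasks. A small sketch (the task names and steps are made up):

```python
import asyncio

events = []

async def worker(name):
    events.append(f"{name}: step 1")  # runs uninterrupted...
    events.append(f"{name}: step 2")  # ...no other task can interleave here
    await asyncio.sleep(0)            # explicit yield point back to the loop
    events.append(f"{name}: step 3")

async def main():
    await asyncio.gather(worker("a"), worker("b"))

asyncio.run(main())
print(events)
```

Steps 1 and 2 of each worker are always adjacent in the log; other tasks can only slot in at the `await`. OS threads give no such guarantee, which is exactly the distinction the muddy "async programming" terminology blurs.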
I don't want to go too heavy on the negatives, but what's nuts is Python going for trust-the-programmer style multithreading. The risk is that extension modules could cause a lot of crashes.
My understanding is that many extension modules are already written to take advantage of multithreading by releasing the GIL when calling into C code. This allows true concurrency in the extension, and also invites all the hazards of multithreading. I wonder how many bugs will be uncovered in such extensions by the free threaded builds, but it seems like the “nuts” choice actually happened a long time ago.
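For a concrete example of that long-standing pattern: CPython's `hashlib` releases the GIL while hashing sufficiently large buffers, so the sketch below (buffer sizes made up) already gets real parallelism on a GIL build, along with all the usual hazards for any shared Python state the threads touch:

```python
import hashlib
import threading

# hashlib's C implementation releases the GIL while hashing large
# buffers, so these threads can genuinely run in parallel -- and any
# shared Python state they touch is exposed to real races.
data = [bytes([i]) * 5_000_000 for i in range(4)]
digests = [None] * 4

def work(i):
    # Writing to distinct list slots avoids sharing mutable state.
    digests[i] = hashlib.sha256(data[i]).hexdigest()

threads = [threading.Thread(target=work, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(digests[0][:16])
```

Extensions that do this have been multithreaded in practice for years; free-threading mostly widens where the same hazards can appear.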
I also wonder how many people actually need free-threading. And I wonder how useful it will be, when you can already use the ABI to call multi-threaded code.
I think the GIL provides Python with a great guarantee; to be honest, I would probably prefer single-thread performance improvements over multithreading in Python.
Anyway, if I need performance, Python would probably not be my first choice.