Comment by crdoconnor
8 years ago
This is true, but Python's relative slowness (along with the GIL) is an issue that is regularly blown out of all proportion.
Part of the reason for the language's success is that it made intelligent tradeoffs, often against the grain of commentariat opinion, and focused on its strengths rather than pandering to the kind of person who writes language-performance comparison blog posts.
If speed were of primary importance then PyPy would be a lot more popular.
You're conflating two kinds of "performance": startup latency and steady-state throughput. We're talking about the former, and you're proposing improvements for the latter. In fact, moving to PyPy is exactly what you shouldn't do to improve startup: the JIT needs time to warm up, so PyPy starts slower than CPython.
It's surprising but frequently true that startup latency has a greater effect on the perception of performance than actual throughput. Nobody likes to type a command and then be kept waiting, even if the started program could in principle demonstrate amazing feats of computation once warmed up.
The GIL is a pretty nasty problem once you try to scale things beyond one core.
Simply try something like unpickling a 10 GB data structure while keeping your GUI responsive in the main thread. You can't, because unpickling holds the GIL almost the entire time it's building objects. Move the data to another process instead of another thread. Great, your GUI is responsive, but now you can't access the data from the main thread.
You can say that such a humongous data structure is wrong or that a GUI isn't meant to be responsive or programmed in Python or that I'm holding it wrong. Probably right.
I've flailed around with this a few times in the last year or so and have found that posting things up and down a multiprocessing.Pipe is the least painful alternative.
So you're basically building a distributed application just because you can't share memory properly. This can be very efficient if little communication is involved or a total nightmare if you have gigabytes of data where you need lots of random read access to walk the data structures at high speed. If you're not careful you spend most of your time pickling and unpickling the stuff you send over your pipes while requiring duplication of your gigabyte data structures in order to gain at least some parallelism.
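One partial escape hatch in the stdlib (Python 3.8+) is multiprocessing.shared_memory, which lets two processes view the same buffer without any pickling or duplication. It only helps for flat binary data (bytes, numpy arrays), not the rich object graphs being walked here, which is exactly the problem:

```python
from multiprocessing import shared_memory

# One process creates a named shared block and writes into it.
shm = shared_memory.SharedMemory(create=True, size=8)
shm.buf[:5] = b"hello"

# A second process would attach by name; simulated in-process here.
shm2 = shared_memory.SharedMemory(name=shm.name)
data = bytes(shm2.buf[:5])  # no copy of the block, no pickling

shm2.close()
shm.close()
shm.unlink()  # free the block once everyone is done
```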
I don't see a way around this mess with the current structure of Python. You would have to reimplement the data-heavy part entirely in another language that provides a proper threading model.
"You're holding it wrong" is a poor response to a wide audience, like iPhone users. But it's an OK response to a specialist, like someone tackling the task you describe.
I'm a professional Python developer and I run into performance problems a lot. Python makes it really hard for even specialists to "hold it right". Contrast that with Go, which (for all the hate it gets) reads a lot like well-formed Python in single-threaded code, and in parallel code reads the way you wish Python could be written. All while being two orders of magnitude faster. If we don't start taking performance seriously in the Python community, Go (or someone else) will eat our lunch sooner or later.
Python derives a good chunk of its speed (if not all of it) from carefully tuned libraries written in other languages (or even targeting other architectures, in the case of many machine-learning packages). As soon as you try to do a lot of heavy processing in Python itself, even the compiled versions quickly bog down. IMO the best way to use Python is as clever glue for highly optimized code. That way you spend the minimum amount of effort and get maximum performance.
Multithreading is the glue I need. How am I supposed to write an optimized native module just to spawn threads that do computation in numpy and pandas?
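For what it's worth, numpy releases the GIL inside many of its C routines, so plain stdlib threads can sometimes get real parallelism for large array operations without any native module. A rough sketch:

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

# Four independent chunks of work; all-ones matrices keep the
# expected result easy to check.
chunks = [np.ones((500, 500)) for _ in range(4)]

def work(a):
    # Matrix multiply is one of the ops where numpy drops the GIL,
    # so these can run genuinely in parallel across threads.
    return (a @ a).sum()

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(work, chunks))

total = sum(results)  # each chunk contributes 500 * 500 * 500
```

This only helps when the time is spent inside numpy's C loops; any pure-Python work in `work()` still serializes on the GIL.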
Yeah, that was kind of my point :/