
Comment by kccqzy

5 days ago

Because in the real world, for code where performance is needed, you run the profiler and either find that the time is spent on I/O, or that the time is spent inside native code.

This might have been your experience, but mine has been very different. In my experience a typical Python workload is 50% importing Python libraries, 45% slow Python wrapper logic, and 5% fast native code. I spend a lot of time rewriting the Python logic in C++, which makes it roughly 100x faster, so the resulting profile approaches "10% fast native logic, 90% useless Python imports".
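
For what it's worth, you can put rough numbers on that split with nothing but the stdlib (CPython's -X importtime flag also prints a per-module breakdown to stderr). A minimal sketch, where json and urllib.request are just stand-ins for whatever your workload actually imports:

    import time

    t0 = time.perf_counter()
    import json             # stand-ins for whatever heavy
    import urllib.request   # libraries your workload pulls in
    import_secs = time.perf_counter() - t0

    t0 = time.perf_counter()
    payload = json.dumps({"n": sum(range(100_000))})  # stand-in for the real work
    work_secs = time.perf_counter() - t0

    print(f"imports: {import_secs:.4f}s  work: {work_secs:.4f}s")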

  • There is more than one PEP aimed at making imports faster, such as PEP 690 or PEP 810, so it's definitely a well-known problem. The solution is probably right around the corner. (The usual workaround in the meantime is sketched just after this list.)

  • Imports being slow is annoying, but it only matters for short-running code.

    • Many simple scripts at my work that do little more than parse arguments and fire off an HTTP request spend half a minute importing things they never use, thanks to spurious dependencies and uncommon code paths. For some unit tests it's 45 seconds, substantially longer than the test logic itself takes to run.

      In dev cycles most code is short-running.


  • If imports are slow, you shouldn't be writing Python in the first place, because you are either on limited hardware or writing a very performance-sensitive app.
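
On the slow-imports subthread above: none of those PEPs is needed for the usual workaround, which is to push heavy imports down into the only functions that use them, so an argparse-and-one-HTTP-request script stops paying for code paths it never takes. A minimal sketch, with urllib.request standing in for the heavy dependency:

    import argparse

    def fetch(url):
        # Deferred import: only paid for on the code path that needs it.
        import urllib.request
        with urllib.request.urlopen(url) as resp:
            return resp.read()

    def main():
        parser = argparse.ArgumentParser()
        parser.add_argument("url")
        args = parser.parse_args()
        print(len(fetch(args.url)))

    if __name__ == "__main__":
        main()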

I do a bit of performance work and find most often that things are mixed: there's enough CPU work between syscalls that the hardware isn't fully utilized, but enough I/O that the CPUs aren't pegged either. It is rare that the profiler finds an obvious hotspot that yields an easy win; usually it shows that with heavy refactoring you can make 10% of your load several times faster, and then you'll need to do the same for the next 10%, and so on. That is the more typical real world for me, and in that world Python looks really awful compared to rewrite-it-in-Rust.
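
To make "no obvious hotspot" concrete: a stdlib cProfile run like the sketch below, sorted by cumulative time, is how you find out whether one fat frame dominates or the time is smeared across dozens of them. The workload function is a hypothetical stand-in:

    import cProfile
    import pstats

    def workload():
        # Hypothetical stand-in for a real mixed CPU/I-O job.
        return sum(i * i for i in range(1_000_000))

    cProfile.run("workload()", "profile.out")
    pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)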

  • This "There are no hot spots, it's just a uniform glowing orange" situation is why Google picked C++ and then later Rust and to some extent why they picked Go too.

When PyPy is a drop-in replacement, as it is for most of my code (and it's dead simple to check: just see whether pypy ./main.py runs), I don't see why you'd run the code 5-50% slower for no reason.

IRL you will have CPU-bottlenecked pure-Python code too, but that alone isn't enough to justify taking on the unknown risk of switching to a less-supported interpreter. Worst case, you put in the effort to convert the hot parts to multiprocessing.
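
That multiprocessing fallback is all stdlib; a minimal sketch, assuming the hot part is an embarrassingly parallel pure-Python loop (hot_chunk is a hypothetical stand-in for it):

    from multiprocessing import Pool

    def hot_chunk(chunk):
        # Stand-in for the CPU-bottlenecked pure-Python hot spot.
        return sum(i * i for i in chunk)

    if __name__ == "__main__":
        chunks = [range(i, i + 250_000) for i in range(0, 1_000_000, 250_000)]
        with Pool() as pool:  # defaults to one worker per core
            print(sum(pool.map(hot_chunk, chunks)))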

Also, the engineer time you would spend optimizing for performance often costs more than just throwing more hardware at the problem.

  • For cloud jobs that can be true, but for single-threaded, dev-in-the-loop work you can't just buy a processor 100x faster than the one in the developer's machine, and the latency is expensive workflow friction.

  • Not if you have certain types of scientific data: you can't rent enough hardware to make the slow code fast enough.