Comment by anaccount342

5 days ago

I don't know how realistic it is to use a benchmark that only exercises tight loops and integer operations. Something with hashmaps and strings more realistically represents everyday CPU code in Python; most Python users offload numeric code to external calls.
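For example, a rough sketch of the kind of workload I have in mind (the word-count task and names here are just illustrative):

```python
import time

# Synthetic corpus; just meant to exercise string methods and dict lookups,
# which is closer to everyday pure-Python CPU work than fib(40).
words = [f"word{i % 1000}" for i in range(1_000_000)]

start = time.perf_counter()
counts = {}
for w in words:
    key = w.upper()                        # string operation
    counts[key] = counts.get(key, 0) + 1   # hashmap operation
print(f"{time.perf_counter() - start:.3f}s, {len(counts)} keys")
```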

There is no "realistic" benchmark; every benchmark is designed to measure something specific. I explain what my goals were in the article, in case you are curious and want to read it.

I agree with you: this is not an in-depth look, and it could have been much more rigorous.

But then I think in some ways it's a much more accurate depiction of my use case. I mainly write Monte Carlo simulations or simple scientific calculations for a diverse set of problems every day. And I'm not going to write a fast algorithm or use an unfamiliar library for a one-off simulation, even if the sim is going to take 10 minutes to run (yes, I use scipy and numpy, but often those aren't the bottlenecks). This is for the sake of simplicity, as I might iterate over the assumptions a few times, and optimized algorithms or library impls are not as trivial to work on or modify on the go. My code often looks super ugly and is as laughably unoptimized as the bubble sort or fib(40) examples (recursive calls and nested for loops). And then if I really need the speed, I will take my time to write some clean C++ with zmq or pybind, or use numba.
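For concreteness, a sketch of the kind of deliberately naive, throwaway loop I mean (a Monte Carlo estimate of pi; the function name is illustrative):

```python
import random

def estimate_pi(n=1_000_000):
    # Plain Python loop, no vectorization; trivial to tweak when the
    # assumptions change, which is the whole point for one-off sims.
    hits = 0
    for _ in range(n):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return 4 * hits / n

print(estimate_pi())
```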

It's still interesting though. If the most basic thing isn't notably faster, it makes it pretty likely the more complex things aren't either.

If your actual load is 1% Python and 99% offloaded, the effect of a faster Python might not matter a lot to you, but to measure Python you kinda have to look at Python.

Or have it run some super common use case like a FastAPI endpoint or a numpy calculation. Yes, they are not all Python, but it's what most people use Python for.

  • FastAPI is a web framework, which by definition is (or should be!) an I/O-bound process. My benchmark evaluates CPU performance, so it's a different thing. There are a ton of web framework benchmarks out there if you are interested in FastAPI and other frameworks.

    And numpy is a) written in C, not Python, and b) not part of Python, so it didn't change when 3.14 was released. The goal was to evaluate the Python 3.14 interpreter. Not to say that it wouldn't be interesting to evaluate the performance of other things as well, but that is not what I set out to do here.

    • That's the thing with Python: a lot of workloads should be bound by all kinds of other limitations, but in practice are often limited by the Python interpreter if not written carefully.

      Fundamentally, for example, if you're doing operations on numpy arrays like c = a + b * c, interpreted numpy will be slower than compiled numba or C++, simply because an eager interpreter evaluates b * c into a temporary array first and will never fuse those operations into a single multiply-add (FMA) pass.
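      A minimal sketch of the difference, assuming numpy and numba are installed (the function name fused_madd is mine):

      ```python
      import numpy as np
      from numba import njit

      @njit
      def fused_madd(a, b, c):
          # Numba compiles this loop to machine code: the multiply and add
          # happen in one pass per element and can map to an FMA instruction.
          out = np.empty_like(c)
          for i in range(c.size):
              out[i] = a[i] + b[i] * c[i]
          return out

      a, b, c = (np.random.rand(1_000_000) for _ in range(3))

      # Eager numpy: b * c allocates a full temporary array, then a + tmp
      # makes a second pass over memory. Two kernels, no fusion.
      eager = a + b * c

      assert np.allclose(eager, fused_madd(a, b, c))
      ```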

    • Numpy is partly written in C but includes a lot of Python code. If you include scipy, scikit-learn, or pandas, most of the code is Python calling primitive numpy C operations. I'd expect many semi-complex data science programs to benefit from improvements in the Python interpreter, especially if they weren't written in super tight numpy code.
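      As a hedged illustration of what interpreter-bound numpy code can look like (synthetic data, illustrative sizes):

      ```python
      import numpy as np

      # Many small numpy calls driven from a Python loop: each call does very
      # little C work, so interpreter overhead (dispatch, attribute lookup,
      # argument handling) dominates. This is where a faster interpreter helps.
      chunks = [np.random.rand(8) for _ in range(100_000)]

      total = 0.0
      for chunk in chunks:
          total += float(chunk.sum())  # tiny C kernel, big Python-side overhead
      print(total)
      ```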