Comment by fragebogen

3 years ago

I'm running a constrained convex optimization project at work, where we need as-close-to-real-time responses (<10 s is great, <1 min is acceptable) for a web interface.

Basically I'm using SciPy exclusively for the optimization routine:

* minimize(method="SLSQP") [0]

* A list comprehension which calls ~10-500 pre-fitted PchipInterpolator [1] functions and stores the values as a np.array().

The Pchip functions (and their first derivatives) are used in the main objective function as well as in several constraints.
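For anyone curious, a minimal sketch of that setup (the interpolator data, the sum-objective, and the budget-style equality constraint are all my own hypothetical stand-ins, not the actual problem):

```python
import numpy as np
from scipy.interpolate import PchipInterpolator
from scipy.optimize import minimize

# Hypothetical pre-fitted interpolators standing in for the ~10-500 real ones.
rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 20)
interps = [PchipInterpolator(xs, np.sort(rng.random(20))) for _ in range(10)]
derivs = [f.derivative() for f in interps]  # first derivatives, reusable in constraints

def objective(x):
    # List comprehension over the Pchip functions, collected into an array.
    return float(np.array([f(xi) for f, xi in zip(interps, x)]).sum())

def objective_grad(x):
    # Gradient of sum_i f_i(x_i) is just the per-component derivatives.
    return np.array([df(xi) for df, xi in zip(derivs, x)])

# Hypothetical equality constraint: allocations must sum to a fixed total.
cons = [{"type": "eq", "fun": lambda x: x.sum() - 5.0}]
bounds = [(0.0, 1.0)] * len(interps)

res = minimize(objective, x0=np.full(len(interps), 0.5),
               jac=objective_grad, method="SLSQP",
               bounds=bounds, constraints=cons)
```

Supplying the analytic `jac` from the pre-built `.derivative()` splines spares SLSQP its finite-difference gradient evaluations, which is where a lot of the per-iteration Python overhead tends to hide.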

Most jobs took about 10 seconds, but the long tail could sometimes take up to 10 min. I tried PyPy 3.8 (7.3.9) and saw similar compute times on the shorter jobs, but roughly 2x slower times on the heavier jobs. This obviously was not what I expected, but I had very limited experience with PyPy and didn't know how to debug further.

Eventually Python 3.10 came around and gave a 1.25x speedup, and then 3.11 gave another 1.6-1.7x, for a decent ~2x cumulative speedup. But the occasional heavy jobs still sit in the 5 min range, and would obviously be much nicer in the 10-30 s range.

Still, I would like to say that trying PyPy out was quite a smooth experience: staying within SciPy land, it took me half a day to switch and benchmark. But if anyone else has experience with PyPy and SciPy and knows some obvious pitfalls, I would much appreciate hearing about them.

[0] https://docs.scipy.org/doc/scipy/reference/optimize.minimize...

[1] https://docs.scipy.org/doc/scipy/reference/generated/scipy.i...

If your bottlenecks are in SciPy or NumPy, then PyPy will not help. Those are primarily written in C, so the PyPy JIT cannot peer inside and do any magic.

  • My effort went rather into speeding up all the Python looping, etc., around the np calls. But I never got far with actually benchmarking the entire pipeline to find out what the real bottleneck was.
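A quick way to settle that question is to profile one heavy job and sort by cumulative time: frames in compiled-extension modules (e.g. anything under `scipy.optimize._slsqp`) are beyond PyPy's reach, while pure-Python frames are fair game. A minimal sketch, where `heavy_job` is a stand-in for one optimization run:

```python
import cProfile
import io
import pstats

def heavy_job():
    # Stand-in for a single slow optimization job.
    total = 0.0
    for i in range(100_000):
        total += i ** 0.5
    return total

profiler = cProfile.Profile()
profiler.enable()
heavy_job()
profiler.disable()

# Print the 10 biggest cumulative-time entries; C-extension calls show up
# as built-in/module entries, Python loops as ordinary function frames.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
print(stream.getvalue())
```

If the top entries are all inside SciPy's compiled routines, neither PyPy nor a newer CPython will move the needle much; if they are in your own Python glue, there is room to optimize.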