Comment by saltcured

3 years ago

I used it for real once, over a decade ago, when I had to help some researchers load an archive of Twitter JSON dumps into an RDBMS. The job was basically cleaning/transliterating data fields into CSV that could be bulk-imported into PostgreSQL. I think we were on Python 2.7 at the time.
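The cleaning step was along these lines. This is a hypothetical sketch, not the original script: the field names (`id`, `created_at`, `text`) and the `dumps_to_csv` helper are assumptions for illustration, and it's written in Python 3 rather than the 2.7 we actually used.

```python
import csv
import io
import json

def dumps_to_csv(lines, fields=("id", "created_at", "text")):
    # Flatten line-delimited JSON events into CSV rows for PostgreSQL COPY.
    # Field names here are illustrative, not Twitter's real schema.
    out = io.StringIO()
    writer = csv.writer(out)
    for line in lines:
        try:
            event = json.loads(line)
        except ValueError:
            continue  # skip malformed events rather than abort the bulk load
        # Replace embedded newlines so each CSV record stays on one line
        writer.writerow(
            str(event.get(f, "")).replace("\n", " ") for f in fields
        )
    return out.getvalue()

rows = dumps_to_csv(['{"id": 1, "created_at": "2012-01-01", "text": "hi\\nthere"}'])
```

The resulting file can then be loaded with something like `COPY tweets FROM '/path/to/tweets.csv' WITH (FORMAT csv);` on the PostgreSQL side.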

1. The same naive deserialization and dict processing code ran much faster with PyPy.

2. Conveniently, PyPy also tolerated some broken surrogate pairs in Twitter's UTF-8 stream, which raised exceptions when I tried to decode the same events with the regular CPython interpreter.
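Point 2 is easy to reproduce on modern CPython (shown here in Python 3, though we were on 2.7): a lone high surrogate encoded into the byte stream fails a strict UTF-8 decode, and you have to opt into a lenient error handler to keep the pipeline moving. The sample bytes below are a made-up stand-in for the broken data.

```python
# b'\xed\xa0\xbd' is a UTF-8-style encoding of the lone high surrogate
# U+D83D (half of an astral pair), the kind of thing the dumps contained.
broken = b'{"text": "\xed\xa0\xbd"}'

try:
    broken.decode("utf-8")  # strict decoding rejects surrogate code points
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Lenient error handlers let you choose how to degrade:
replaced = broken.decode("utf-8", errors="replace")      # U+FFFD markers
passed = broken.decode("utf-8", errors="surrogatepass")  # keep the surrogate
```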

I've had some web service code where I wished I could easily swap in PyPy, but those were conservative projects using Apache + mod_wsgi daemons with SELinux. If there had been a mod_wsgi_pypy that could act as a drop-in replacement, I would have advocated for trials/benchmarking with the ops team.

Most other performance-critical work for me has been with combinations of numpy, PyOpenCL, PyOpenGL, and various imaging codecs like `tifffile` or piping numpy arrays in/out of ffmpeg subprocesses.