Comment by rudhdb773b
15 hours ago
If the focus is performance, why use a separate process and deal with data-serialization overhead?
Why not a typical shared library that can be loaded into Python, R, Julia, etc., and run on large data sets without even a memory copy?
That way you don't even need Python, R, or Julia: you can connect directly from your backend systems, which are presumably written in a fast language. If Python is in your call stack, you already don't care about absolute performance.
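To make the zero-copy point concrete, here is a minimal sketch of the in-process pattern using Python's ctypes, with libc's strlen standing in for an analytics library's entry point (assumption: a C runtime is findable via ctypes.util; on Linux this resolves to libc.so.6). The library reads the host process's buffer in place, with no serialization and no copy across a process boundary:

```python
import ctypes
import ctypes.util

# Load the C runtime as an in-process shared library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# A mutable buffer owned by the host process; the shared library
# will read this memory directly rather than a serialized copy.
buf = ctypes.create_string_buffer(b"hello shared library")

# Declare the foreign function's signature, then call it on the
# buffer's own memory -- no IPC, no marshalling.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
n = libc.strlen(buf)
print(n)  # 20
```

The same mechanism is how R (.Call) and Julia (ccall) bind native libraries, which is why a single shared library can serve all three frontends.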
I owe you a beer!
Perhaps because the performance is good enough, and this approach is much simpler and more portable across platforms than shared libraries.
Exactly. The objective is to abstract the implementation away completely. Shared libraries just add too much overhead.