Comment by rudhdb773b
15 hours ago
If the focus is performance, why use a separate process and deal with data-serialization overhead?
Why not a typical shared library that can be loaded into Python, R, Julia, etc., and run on large data sets without even a memory copy?
That way you don't even need Python, R, or Julia: you can connect directly from your backend systems, which are presumably written in a fast language. If Python is in your call stack, you already don't care about absolute performance.
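To make the zero-copy point concrete, here is a minimal sketch of the in-process pattern using Python's ctypes, with libc's strlen standing in for an analytics library's entry point (assumption: a C runtime is findable via ctypes.util; on Linux this resolves to libc.so.6). The library reads the host process's buffer in place, with no serialization and no copy across a process boundary:

```python
import ctypes
import ctypes.util

# Load the C runtime as an in-process shared library.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# A mutable buffer owned by the host process; the shared library
# will read this memory directly rather than a serialized copy.
buf = ctypes.create_string_buffer(b"hello shared library")

# Declare the foreign function's signature, then call it on the
# buffer's own memory -- no IPC, no marshalling.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
n = libc.strlen(buf)
print(n)  # 20
```

The same mechanism is how R (.Call) and Julia (ccall) bind native libraries, which is why a single shared library can serve all three frontends.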
I owe you a beer!
Perhaps because the performance is good enough, and this approach is much simpler and more portable across platforms than shared libraries.
Exactly. The objective is to abstract the implementation away completely. Shared libraries just add too much overhead.