Comment by unnouinceput
4 years ago
Nice benchmarking, thank you for your effort.
Also, the scientist was leading a team, so my program would have been used by at least 20 people, 10 times per day. That is why the Python one was a no-go for them from the beginning.
Can you share the algorithm, or anything computationally equivalent, for people to try benchmarking different implementations?
The one I wrote 2 years ago? That's the intellectual property of said data scientist, not mine. All I can say is that I parallelized it heavily, which is why it took the entire month. From a programming point of view it is a mess, and its ~5k lines are hard to follow. Parallel programming is usually a mess; take a look at any parallel CUDA code available on GitHub.
Was the original just regular Python, or NumPy?
From your description, though, the C version wasn't GPU-targeted. I'm curious what other implementations would be capable of, for instance Julia, maybe GPU-targeted.