Comment by light_hue_1
8 months ago
Massive promises of amazing performance, but they can't find one convincing example to showcase it. It's hard to see what they're bringing to the table when even the simplest possible Haskell code runs just as fast on my 4-year-old laptop with an ancient version of GHC (8.8). No need for an RTX 4090.
module Main where
sum' :: Int -> Int -> Int
sum' 0 x = x
sum' depth x = sum' (depth - 1) ((x * 2) + 0) + sum' (depth - 1) ((x * 2) + 1)
main = print $ sum' 30 0
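(For reference: the comment doesn't say how this was compiled; a plausible way to reproduce such a timing, assuming GHC is installed, is to build with "ghc -O2 Main.hs" and run under "time ./Main".)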
Runs in 2.5s. Sure, it's not on a GPU, but it's faster! And code doesn't get much more high-level than this.
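For context, here is a minimal sketch (an illustration added here, not code from the thread) of how the same recursion parallelizes across CPU cores using plain Haskell's Control.Parallel; a GPU runtime's speedup claims should arguably be measured against a multi-core baseline like this, not only against the single-threaded run:

import Control.Parallel (par, pseq)  -- needs the 'parallel' package

sum' :: Int -> Int -> Int
sum' 0 x = x
sum' depth x = sum' (depth - 1) (x * 2) + sum' (depth - 1) (x * 2 + 1)

-- Spark the two recursive halves in parallel until the remaining
-- depth falls below a cutoff, then fall back to the sequential sum'.
psum :: Int -> Int -> Int -> Int
psum cutoff depth x
  | depth <= cutoff = sum' depth x
  | otherwise       = l `par` (r `pseq` (l + r))
  where
    l = psum cutoff (depth - 1) (x * 2)
    r = psum cutoff (depth - 1) (x * 2 + 1)

main :: IO ()
main = print (psum 24 30 0)

-- Build with: ghc -O2 -threaded Main.hs
-- Run with:   ./Main +RTS -N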
If you're going to promise amazing performance from a high level language, I'd want to see a comparison against JAX.
It's an improvement over traditional interaction nets, sure! But interaction nets have always been a failure performance-wise. Interaction nets are the PL equivalent of genetic algorithms in ML: they sound like a cool idea and have a nice story, but they always seem to be a dead end.
Interaction nets optimize for parallelism at the cost of everything else, including single-threaded performance. You're just warming the planet by burning massive numbers of parallel GPU cores to do what a single CPU core could do more easily. They're just the wrong answer to this problem.
You're wrong. That Haskell code compiles to a loop, which we haven't optimized for yet. I've edited the README to use the Bitonic Sort instead, where allocations are unavoidable. Past N=20, HVM2 runs 4x faster than GHC -O2.
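To make the distinction concrete: the sum' above allocates nothing, so GHC can compile it down to tight arithmetic, whereas a benchmark that must build a data structure pays for an allocation on every node. A minimal sketch of that allocation-heavy shape (an illustration added here, not the README's actual Bitonic Sort code):

data Tree = Leaf !Int | Node Tree Tree

-- Build a complete binary tree of the given depth; every Node is a
-- heap allocation, so this cost cannot be compiled away into a loop.
gen :: Int -> Int -> Tree
gen 0 x = Leaf x
gen d x = Node (gen (d - 1) (x * 2)) (gen (d - 1) (x * 2 + 1))

-- Fold the tree back down to a single Int.
sumT :: Tree -> Int
sumT (Leaf x)   = x
sumT (Node a b) = sumT a + sumT b

main :: IO ()
main = print (sumT (gen 26 0))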
What? I ran your example, from your readme, where you promise a massive performance improvement, and you're accusing me of doing something wrong?
This is exactly what a scammer would say.
I guess that's the point here. Scam people who don't know anything about parallel computing by never comparing against any other method?
Thanks for the feedback! Some clarifications:
1. I didn't accuse you of doing anything wrong, just said that your claim was wrong! It has been proven that Interaction Combinators are an optimal model of concurrent computation. I also pointed to cases where they achieve practical efficiency, outperforming GHC's highest optimization level.
2. The claimed performance scaling has indeed been achieved, and the code is open for anyone to replicate our results. The machines used are listed in the repository and the paper. If you have any trouble replicating, please let me know!
3. We're not selling any product. Bend is Apache-licensed.
Insulting anyone who pushes back even a little is not a good sign, I agree. From all we can see so far, the massive speedups being claimed come with no optimized baselines for comparison. I would very much like to see those identified.