Comment by rowanG077
8 months ago
Where did he claim it is fast? As far as I can see the only claim is that it scales linearly with cores. Which it actually seems to do.
[flagged]
I'll give you the benefit of the doubt in case the README changed, but here's the benchmark it currently claims, against its own execution modes:
CPU, Apple M3 Max, 1 thread: 12.15 seconds
CPU, Apple M3 Max, 16 threads: 0.96 seconds
GPU, NVIDIA RTX 4090, 16k threads: 0.21 seconds
The README mentions "fast" in 2 places, neither of which compares it to other languages.
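For reference, the speedups implied by those three timings can be worked out directly. This is only a quick sketch: the function names are made up for illustration, "efficiency" is taken as speedup divided by thread count, and the GPU ratio compares different hardware, so it should be read loosely.

```python
# Timings quoted above, in seconds.
CPU_1_THREAD = 12.15
CPU_16_THREADS = 0.96
GPU_16K_THREADS = 0.21


def speedup(baseline_s: float, parallel_s: float) -> float:
    """How many times faster the parallel run is than the baseline."""
    return baseline_s / parallel_s


def efficiency(baseline_s: float, parallel_s: float, workers: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup(baseline_s, parallel_s) / workers


cpu_speedup = speedup(CPU_1_THREAD, CPU_16_THREADS)
cpu_efficiency = efficiency(CPU_1_THREAD, CPU_16_THREADS, 16)
gpu_speedup = speedup(CPU_1_THREAD, GPU_16K_THREADS)

print(f"16 threads vs 1 thread: {cpu_speedup:.2f}x ({cpu_efficiency:.0%} of linear)")
print(f"GPU vs 1 CPU thread:    {gpu_speedup:.2f}x")
```

So going from 1 to 16 CPU threads gives roughly 12.7x, about 79% of ideal linear scaling.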
You're missing some context: it's not bitonic sort itself that would present an issue with GPUs, it's the "with immutable tree rotations" part, which in a naive implementation would imply some kind of memory management that would have trouble scaling to thousands of cores.
Yes, and those benchmarks are real. Showing linear speedup in the number of cores when writing standard code is a real achievement. If you assumed that this somehow means it is a state-of-the-art compiler with super blazing performance, that is on no one but you. The README lays it out very clearly.
[flagged]