Wasn’t this a bunch of kernels that didn’t work?
What do you mean?
The hallucinated code was reusing memory buffers still filled with previous results, so it wasn't actually performing the computations. Once this was fixed, the AI-generated code ran at something like 0.3x the speed of the baseline.
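To make the failure mode concrete, here's a rough sketch in plain Python (not the project's actual harness; `reference_kernel`, `noop_kernel`, `naive_check` and `poisoned_check` are all made-up names for illustration). A check that reuses the baseline's output buffer will pass a "kernel" that does nothing at all, while filling the output with NaN first catches it:

```python
import math

def reference_kernel(x, out):
    """Hypothetical baseline: writes 2 * x[i] into out[i]."""
    for i, v in enumerate(x):
        out[i] = 2.0 * v

def noop_kernel(x, out):
    """Hypothetical broken 'fast' kernel: returns without writing to out."""
    return None

def naive_check(candidate, reference, x):
    """Flawed check: the candidate gets the buffer the baseline just filled."""
    buf = [0.0] * len(x)
    reference(x, buf)
    expected = list(buf)
    candidate(x, buf)  # buf still holds the baseline's results
    return buf == expected

def poisoned_check(candidate, reference, x):
    """Safer check: the candidate gets a fresh buffer filled with NaN."""
    expected = [0.0] * len(x)
    reference(x, expected)
    out = [math.nan] * len(x)  # any element left unwritten stays NaN
    candidate(x, out)
    # math.isclose is False whenever either argument is NaN
    return all(math.isclose(a, b) for a, b in zip(out, expected))

if __name__ == "__main__":
    data = [1.0, 2.0, 3.0]
    print(naive_check(noop_kernel, reference_kernel, data))     # True
    print(poisoned_check(noop_kernel, reference_kernel, data))  # False
```

The same idea carries over to GPU buffers: overwrite the output allocation with a sentinel before every timed or checked run, so stale results can't pass for fresh ones.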
They don't verify the correctness of their kernels. They expect you to pick the working ones from their kernel junkyard yourself.
The very idea is also dumb as hell. They could have done CUDA -> HIP/oneAPI/Metal/Vulkan/SYCL/OpenCL. Then they wouldn't need to beat the performance of anything; the automatic porting alone would be worth an acquisition by AMD or Intel.