They don't verify the correctness of their kernels. They expect you to pick the working ones from their kernel junkyard yourself.
The very idea is also dumb as hell. They could have done CUDA -> HIP/oneAPI/Metal/Vulkan/SYCL/OpenCL. Then they wouldn't need to beat the performance of anything, just the automatic porting would be worth an acquisition by AMD or Intel.
The hallucinated code was reusing memory buffers filled with previous results so not performing the actual computations. When this was fixed the AI generated code was like 0.3x of the baseline.
Wasn’t this a bunch of kernels that didn’t work?
What do you mean?
They don't verify the correctness of their kernels. They expect you to pick the working ones from their kernel junkyard yourself.
The very idea is also dumb as hell. They could have done CUDA -> HIP/oneAPI/Metal/Vulkan/SYCL/OpenCL. Then they wouldn't need to beat the performance of anything, just the automatic porting would be worth an acquisition by AMD or Intel.
1 reply →
The hallucinated code was reusing memory buffers filled with previous results so not performing the actual computations. When this was fixed the AI generated code was like 0.3x of the baseline.
2 replies →
[dead]