Comment by saagarjha

5 months ago

Wasn’t this a bunch of kernels that didn’t work?

7 comments

saagarjha

Reply

t55 5 months ago

What do you mean?

imtringued 5 months ago
They don't verify the correctness of their kernels. They expect you to pick the working ones from their kernel junkyard yourself.
The very idea is also dumb as hell. They could have done CUDA -> HIP/oneAPI/Metal/Vulkan/SYCL/OpenCL. Then they wouldn't need to beat the performance of anything, just the automatic porting would be worth an acquisition by AMD or Intel.
- bwfan123 5 months ago
  
  Problem with startups like Devin (AI sw engineer) and Sakana (AI research scientist) is that they are full of hot-air.
  They get caught up in the hype, and focus on the marketing and not the essential engineering.
pavelstoev 5 months ago
The hallucinated code was reusing memory buffers filled with previous results so not performing the actual computations. When this was fixed the AI generated code was like 0.3x of the baseline.
- neodypsis 5 months ago
  
  It is mentioned on section "Limitations and Bloopers" of the page [0]:
  > Combining evolutionary optimization with LLMs is powerful but can also find ways to trick the verification sandbox. We are fortunate to have Twitter user @main_horse help test our CUDA kernels, to identify that The AI CUDA Engineer had found a way to “cheat”. The system had found a memory exploit in the evaluation code which, in a small percentage of cases, allowed it to avoid checking for correctness (...)
  0. https://sakana.ai/ai-cuda-engineer
  
  1 reply →

tsunego 5 months ago

[dead]