Comment by neodypsis
2 days ago
It is mentioned in the "Limitations and Bloopers" section of the page [0]:
> Combining evolutionary optimization with LLMs is powerful but can also find ways to trick the verification sandbox. We are fortunate to have Twitter user @main_horse help test our CUDA kernels, to identify that The AI CUDA Engineer had found a way to “cheat”. The system had found a memory exploit in the evaluation code which, in a small percentage of cases, allowed it to avoid checking for correctness (...)
As I write this (after the updates to the evaluation code), https://pub.sakana.ai/ai-cuda-engineer/kernel/2/23/optimize-... is at the top of their list of speedups, claiming a 128x speedup on a fused 3D convolution + groupnorm + mean.
The generated implementation doesn’t do a convolution.
The 2nd kernel on the leaderboard also appears to be incorrect: it contains a bunch of dead code that computes a convolution, never uses the result, and writes tanhf(1.0f) * scaling_factor for every output element.