← Back to context

Comment by Amekedl

7 months ago

I'm thinking reading numbers like this is really just slop lately.

FA achieving a 32.5% speed up? Cool.

Why not submit it as a PR to the Flash Attention repo then? Can I read about it more in detail?

I have not read this linked article, but your comment made me recall a discussion about a speed up of CUDA kernels presented by Sakana AI Labs. The researcher Ravid Shwartz Ziv at NYU posted about it on LinkedIn [1], and here is the Twitter post of interest [2]

""" Yesterday's news about Sakana AI Labs provided an important lesson for all of us working with AI agents. Their announcement of an AI system that could supposedly optimize CUDA kernels to run 100x faster initially seemed like exactly the kind of use cases we've been hoping for in AI-assisted development.

Like many others, I was excited about it. After all, isn't this exactly what we want AI to do - help us optimize and improve our technical systems?

However, careful investigation by the community (on Twitter) revealed a different story. What really happened? The AI-generated CUDA kernel appeared to achieve incredible speedups, but the code was inadvertently reusing memory buffers containing previous results, essentially bypassing the actual computation. When properly evaluated, the kernel actually runs about 3x slower than the baseline. """

[1] https://www.linkedin.com/posts/ravid-shwartz-ziv-8bb18761_ye...

[2] https://x.com/main_horse/status/1892473238036631908

  • lmao this is exactly the kind of stuff I always see from Claude. It’s like adding a Skip() to a test and declaring it works now. “Well it’s a lot faster, I met the criteria of my TODOs cya”

    I’ve seen it so much I kinda doubt it was “inadvertent” because they’re like seemingly intentional about their laziness, and will gaslight you about it too.

    • So annoying. Also, when it hardcodes the expected response in a mock, bypassing the purpose entirely. “Test passes now!”

      Funny, 5 years ago we had these same complaints, but about (some) people.

    • Well you forgot to fully qualify your linguistic basis and semantic interpretation of the text of your wish to the great genie bottle.

I assume the Gemini results are JAX/PAX-ML/Pallas improvements for TPUs so would look there for recent PRs