Comment by DrNosferatu
2 months ago
Ok, what FlashAttention changes is space complexity: from O(N^2) to O(N). Time complexity is still ~O(N^2) as with standard Self-Attention.
In other words, it optimizes practical runtime through I/O reduction (fewer reads and writes to slow GPU memory) without altering the asymptotic time complexity.
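
To make that concrete, here's a rough NumPy sketch of the idea, not the actual fused CUDA kernel. It assumes a single head and no masking, and the function names `naive_attention` and `tiled_attention` and the `block` parameter are made up for illustration. The tiled version streams over key/value blocks with a running softmax, so it never holds the full N x N score matrix, yet it still computes every one of the N^2 query-key dot products.

```python
import numpy as np

def naive_attention(q, k, v):
    # Materializes the full N x N score matrix: O(N^2) extra memory.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    return (p / p.sum(axis=-1, keepdims=True)) @ v

def tiled_attention(q, k, v, block=16):
    # Hypothetical sketch: processes keys/values one block at a time and
    # keeps only per-row running statistics, so extra memory is O(N) for a
    # fixed block size. The loop still touches all N^2 query-key pairs.
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros((n, v.shape[1]))
    row_max = np.full((n, 1), -np.inf)  # running max of scores per query
    row_sum = np.zeros((n, 1))          # running softmax denominator
    for start in range(0, k.shape[0], block):
        kb = k[start:start + block]
        vb = v[start:start + block]
        s = (q @ kb.T) * scale          # only an N x block slice of scores
        new_max = np.maximum(row_max, s.max(axis=-1, keepdims=True))
        # Rescale what was accumulated under the old max (0 on the first block).
        correction = np.exp(row_max - new_max)
        p = np.exp(s - new_max)
        row_sum = row_sum * correction + p.sum(axis=-1, keepdims=True)
        out = out * correction + p @ vb
        row_max = new_max
    return out / row_sum

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((50, 8)) for _ in range(3))
same = np.allclose(naive_attention(q, k, v), tiled_attention(q, k, v))
```

Both functions should agree up to floating-point error; the difference is only in how much intermediate state is held at once. The real kernel also tiles the queries so each tile fits in on-chip SRAM, which is where the I/O savings come from.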