Comment by semiinfinitely
2 months ago
I see no mention of the prior work on the Hyena operator, which already demonstrated O(n log n) full-context mixing several years ago.
Hyena came out of Albert Gu's prior work in the same lab. https://arxiv.org/abs/2111.00396