Comment by semiinfinitely
1 year ago
I see no mention of the Hyena operator, prior work that already demonstrated O(n log n) full-context mixing several years ago.
Hyena came out of Albert Gu's prior work in the same lab. https://arxiv.org/abs/2111.00396