Comment by nextos
7 hours ago
Deep state space models (SSMs), including the entire S4-to-Mamba saga, are a very interesting alternative to transformers. In some of my genomics use cases, Mamba has been easier than transformers to train and to scale to large context windows.