Comment by imtringued

2 months ago

Mamba is solving a different problem than transformers.

What Mamba does is take an initial state s_0 and an input u_0 and produce a new state s_1 and an output o_1. It's basically modeling a very complicated state machine. I can easily think of half a dozen applications where this is exactly what you want and where it beats transformers, but LLMs are not among them. Essentially, most control problems boil down to what Mamba does. In fact, I would say that Mamba as an architecture is probably the ne plus ultra for modeling mechanical system dynamics.
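
To make the state-machine framing concrete, here is a rough sketch of the (state, input) -> (new state, output) step as a plain linear state-space recurrence. This is not Mamba itself: Mamba makes the parameters depend on the input and learns them, while the matrices below are fixed and hand-written. The names `step` and `rollout`, and the spring-damper constants `dt`, `k`, `c`, are my own assumptions, picked only for illustration.

```python
# Illustrative linear state-space model, NOT Mamba's selective variant.
# Recurrence: s_{t+1} = A s_t + B u_t,  o_{t+1} = C s_{t+1}

def step(state, u, A, B, C):
    """One transition: (s_t, u_t) -> (s_{t+1}, o_{t+1})."""
    n = len(state)
    new_state = [
        sum(A[i][j] * state[j] for j in range(n)) + B[i] * u
        for i in range(n)
    ]
    output = sum(C[i] * new_state[i] for i in range(n))
    return new_state, output

def rollout(s0, inputs, A, B, C):
    """Feed a sequence of inputs through the recurrence, carrying the state."""
    state, outputs = list(s0), []
    for u in inputs:
        state, o = step(state, u, A, B, C)
        outputs.append(o)
    return state, outputs

# Hypothetical example: mass-spring-damper, explicit Euler discretization.
# State is [position, velocity]; input is a force; output is position.
dt, k, c = 0.1, 1.0, 0.5           # assumed time step, stiffness, damping
A = [[1.0, dt], [-k * dt, 1.0 - c * dt]]
B = [0.0, dt]
C = [1.0, 0.0]

final_state, positions = rollout([1.0, 0.0], [0.0] * 50, A, B, C)
```

The point is only the shape of the computation: a fixed-size state is carried forward one step at a time, which is the form most control and system-dynamics problems already come in.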