Comment by jszymborski

4 months ago

"LSTMs brought essentially unlimited depth to supervised RNNs"

LSTMs are an incredible architecture, and I use them a lot in my research. But while LSTMs stay useful over many more timesteps than other RNNs, they certainly don't offer "essentially unlimited depth".

When training LSTMs whose inputs were sequences of amino acids, whose lengths easily top 3,000 timesteps, I got huge amounts of instability, with gradients rapidly vanishing. Tokenizing the AAs, getting the number of timesteps down to more like 1,500, has made things way more stable.
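To make the idea concrete, here's a minimal sketch of one way to roughly halve the timestep count: merging adjacent residues into non-overlapping 2-mer tokens before feeding the LSTM. The vocabulary and pairing scheme are illustrative assumptions, not my actual tokenizer.

```python
# Illustrative 2-mer tokenizer for protein sequences.
# Merging adjacent amino acids into pair tokens roughly halves the
# number of timesteps the LSTM has to propagate gradients through.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues

# Vocab of all 2-mers (400 tokens), plus single-residue tokens for odd tails.
PAIR_VOCAB = {a + b: i for i, (a, b) in
              enumerate((a, b) for a in AMINO_ACIDS for b in AMINO_ACIDS)}
SINGLE_OFFSET = len(PAIR_VOCAB)
SINGLE_VOCAB = {a: SINGLE_OFFSET + i for i, a in enumerate(AMINO_ACIDS)}

def tokenize_pairs(seq: str) -> list[int]:
    """Encode a protein sequence as non-overlapping 2-mer token ids."""
    tokens = []
    i = 0
    while i + 1 < len(seq):
        tokens.append(PAIR_VOCAB[seq[i:i + 2]])
        i += 2
    if i < len(seq):  # odd-length sequence: encode the tail residue alone
        tokens.append(SINGLE_VOCAB[seq[i]])
    return tokens

seq = "MKV" * 1000                   # a 3,000-residue toy sequence
tokens = tokenize_pairs(seq)
print(len(seq), "->", len(tokens))   # 3000 -> 1500
```

A learned subword scheme (e.g. BPE over residues) would compress further, but even this fixed pairing cuts the gradient path length in half.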