Comment by twobitshifter
1 year ago
Agreed. Ilya Sutskever himself spent a long time working with LSTMs and published papers like this one while at Google: http://proceedings.mlr.press/v37/jozefowicz15.pdf
In recent comments he has said that any architecture can achieve transformer-level accuracy and recall, but that we have devoted our energy to refining transformers because of their early successes.