Comment by AnotherGoodName

9 days ago

Especially since LLM tech was originally developed for translation. That’s the original reason so much work was done to create a model that could handle context, and it turned out that was helpful in far more areas than just translation.

While LLM usage is only just spinning up in other areas, for translation these models have been doing the job well for over five years now.

Specifically, GNMT came out in 2016, which is 9 years ago.

GNMT used seq2seq RNNs with attention to do translation. That line of work, once researchers realized attention alone could replace the recurrence, led to transformers, and here we are today.
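For anyone curious what "attention" actually means here, a rough sketch in Python of the scaled dot-product form the transformer paper popularized (GNMT itself used an additive variant, so this is illustrative rather than GNMT's actual mechanism): each output is just a softmax-weighted sum of source vectors, weighted by how well each source position matches the current query.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Minimal attention: weight each value by how well its key matches the query."""
    d_k = keys.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)           # similarity of every query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over source positions
    return weights @ values                              # weighted sum of the source values

# Toy example: 3 source tokens, 2 target positions, 4-dim vectors (all made up for illustration)
rng = np.random.default_rng(0)
keys = rng.normal(size=(3, 4))
values = rng.normal(size=(3, 4))
queries = rng.normal(size=(2, 4))
print(scaled_dot_product_attention(queries, keys, values).shape)  # (2, 4)
```

The point is that nothing recurrent is needed to mix information across positions, which is the step that took the field from GNMT-style RNN translators to transformers.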