Comment by gorbypark
2 years ago
Slightly unrelated, since each model is trained and tuned for specific task(s), but the original Transformer architecture and paper were built with translation in mind. The original performance tests were machine translation benchmarks.