Comment by gorbypark
2 years ago
Slightly unrelated, since each model is trained and tuned for specific task(s), but the original Transformer architecture and paper were built with translation in mind. The original performance tests were machine translation benchmarks.