Comment by astrange
1 day ago
Predicting the next word is the interface, not the implementation.
(It's a pretty constraining interface though: the model outputs an entire distribution over the vocabulary, and then we immediately throw it away by sampling just one token from it.)
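
To make that concrete, here is a rough sketch of a single decoding step. The vocabulary and logits are made up for illustration and do not come from any real model; the point is only that a full distribution exists at each step, and sampling reduces it to one token.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_bits(probs):
    """Shannon entropy of the distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical toy vocabulary and logits (assumed values, not real model output).
vocab = ["cat", "dog", "fish", "bird"]
logits = [2.0, 1.5, 0.3, -1.0]

# What the model actually produces: a probability for every token.
probs = softmax(logits)

# What the interface keeps: one sampled token. The rest of probs is dropped.
rng = random.Random(0)
token = rng.choices(vocab, weights=probs, k=1)[0]

print("distribution:", dict(zip(vocab, probs)))
print("entropy (bits):", entropy_bits(probs))
print("sampled token:", token)
```

Everything in probs other than the one sampled token (how close the runner-up was, how uncertain the step was overall) is gone once the token is appended to the context.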