Comment by astrange
3 months ago
Predicting the next word is the interface, not the implementation.
(It's a pretty constraining interface though - the model outputs an entire distribution and then we instantly lose it by only choosing one token from it.)
No comments yet
Contribute on Hacker News ↗