Comment by namibj
1 day ago
Basically, you can generate the next two tokens at once in the same matmul, and roll back to one-at-a-time whenever the output shows that you guessed wrong (that means the second token of the pair was generated from context that has since been revoked).
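
To make that concrete, here is a minimal toy sketch of the idea, not a real inference stack. Everything in it is a made-up stand-in: `target_next` plays the expensive model, `draft_next` is a cheap guesser, and `batched_target` stands in for the single matmul that scores both positions at once. The only point is the accept/rollback logic, which should always produce the same tokens as plain one-at-a-time greedy decoding.

```python
from typing import List, Sequence, Tuple

Token = int


def target_next(context: Sequence[Token]) -> Token:
    """Hypothetical stand-in for the expensive model (deterministic, greedy)."""
    return (sum(context) * 31 + len(context)) % 7


def draft_next(context: Sequence[Token]) -> Token:
    """Hypothetical cheap guesser; it only sometimes agrees with target_next."""
    return (context[-1] + 1) % 7 if context else 0


def batched_target(contexts: Sequence[Sequence[Token]]) -> List[Token]:
    """Stands in for one batched forward pass (the 'same matmul')."""
    return [target_next(c) for c in contexts]


def generate_pairs(prompt: Sequence[Token], n_tokens: int) -> Tuple[List[Token], int]:
    """Generate n_tokens, trying to emit two tokens per batched pass."""
    out = list(prompt)
    accepted_pairs = 0
    while len(out) - len(prompt) < n_tokens:
        guess = draft_next(out)
        # One pass scores the real context and the context extended by the guess.
        first, second = batched_target([out, out + [guess]])
        out.append(first)  # the first token never depends on the guess
        if first == guess and len(out) - len(prompt) < n_tokens:
            # Guess confirmed: the second token was computed on valid context.
            out.append(second)
            accepted_pairs += 1
        # Otherwise the second token was based on revoked context: drop it.
    return out[len(prompt):], accepted_pairs


def generate_sequential(prompt: Sequence[Token], n_tokens: int) -> List[Token]:
    """Plain one-token-at-a-time decoding, used as the reference."""
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(target_next(out))
    return out[len(prompt):]


if __name__ == "__main__":
    tokens, pairs = generate_pairs([1, 2, 3], 20)
    print(tokens == generate_sequential([1, 2, 3], 20), pairs)
```

In this sketch a wrong guess costs nothing in correctness: you still keep the first token, so the worst case is the same number of passes as sequential decoding, and every confirmed guess saves one pass.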