Comment by sailingparrot
3 months ago
Indeed what I meant. The LLM isn’t a blank slate at the beginning of each new token during autoregression as the kv cache is there.
3 months ago
Indeed what I meant. The LLM isn’t a blank slate at the beginning of each new token during autoregression as the kv cache is there.
No comments yet
Contribute on Hacker News ↗