Comment by zozbot234
5 hours ago
Grug says prompt caching just store KV-cache which is sequenced by token. Easy cut it back to just before edit. Then regenerate after is just like prefill but tiny.
5 hours ago
Grug says prompt caching just store KV-cache which is sequenced by token. Easy cut it back to just before edit. Then regenerate after is just like prefill but tiny.
No comments yet
Contribute on Hacker News ↗