Comment by gettincrafty
1 day ago
Do you have any links to papers on the “unRoPE” and “re-RoPE” techniques? I tried searching and couldn’t find anything. I would love to look into this idea more.
I think that copy/paste-able KV cache idea sounds pretty promising. It might lose some of the inter-document context and attention that would normally get built up in the model's hidden state as it processes the prompt. Maybe just throw in some ‘reasoning’ tokens before it gives its answer, to give it a chance to attend across documents.
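For anyone trying to picture it, here is a rough sketch of what “unRoPE” and “re-RoPE” could mean, assuming standard rotary embeddings with interleaved pairs and base 10000. None of this comes from a paper: the function names `rope` and `move_cached_key` are made up for illustration, and real implementations may pair dimensions differently. The idea is that a cached key was rotated by its old position, so rotating it back and then forward by the new position should match a key computed fresh at the new position.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by the angle pos * base**(-i/d).

    Assumes the interleaved-pair convention; a negative pos undoes the rotation.
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def move_cached_key(cached_key, old_pos, new_pos):
    """Hypothetical helper: 'unRoPE' at old_pos, then 're-RoPE' at new_pos."""
    return rope(rope(cached_key, -old_pos), new_pos)

key = [0.5, -1.0, 2.0, 0.25]            # toy un-rotated key, head dim 4
cached = rope(key, 7)                   # as it would sit in the cache at position 7
moved = move_cached_key(cached, 7, 42)  # pasted into a prompt at position 42
fresh = rope(key, 42)                   # key computed directly at position 42

assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(moved, fresh))
```

This only fixes up the positional part of the keys. The key and value contents were still computed without seeing the other documents, which is the cross-document loss mentioned above.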