Comment by gettincrafty

7 months ago

Do you have any links to papers on the “unRoPE” and “re-RoPE” techniques? I tried searching and couldn’t find anything, and I would love to look into this idea more.

I think the copy/paste-able KV cache idea sounds pretty promising. It might lose some of the inter-document context and attention that would otherwise build up in the model’s hidden state as it processes the prompt. Maybe just throw in some ‘reasoning’ tokens before it gives its answer, to give it a chance to attend across documents.
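For anyone else trying to picture it, here is a rough sketch of what I assume “unRoPE” and “re-RoPE” mean: standard rotary embeddings rotate each pair of key channels by an angle proportional to the token position, so a cached key can be un-rotated from its old position and re-rotated at a new one, which collapses to a single rotation by the position difference. The function names (`apply_rope`, `move_cached_key`) and the interleaved channel pairing are my own choices for illustration, not taken from any paper or library.

```python
import numpy as np

def rope_angles(pos, dim, base=10000.0):
    # One rotation angle per channel pair, as in standard RoPE.
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    return pos * inv_freq

def rotate(x, angles):
    # Rotate each (even, odd) channel pair of x by its angle.
    x1, x2 = x[0::2], x[1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin
    out[1::2] = x1 * sin + x2 * cos
    return out

def apply_rope(x, pos, base=10000.0):
    # Position-encode a single key (or query) vector at position pos.
    return rotate(x, rope_angles(pos, x.shape[-1], base))

def move_cached_key(k_rot, old_pos, new_pos, base=10000.0):
    # Hypothetical "unRoPE then re-RoPE": undoing the rotation for old_pos
    # and applying the one for new_pos equals rotating by (new_pos - old_pos).
    return rotate(k_rot, rope_angles(new_pos - old_pos, k_rot.shape[-1], base))

if __name__ == "__main__":
    k = np.random.default_rng(0).normal(size=8)
    cached = apply_rope(k, 5)                  # key cached at position 5
    moved = move_cached_key(cached, 5, 105)    # pasted in at position 105
    print(np.allclose(moved, apply_rope(k, 105)))
```

This only fixes up the positions. The cached keys and values were still computed without seeing the other documents, which is the lost cross-document attention I mentioned above.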
