Comment by rishabhaiover

5 hours ago

After a certain amount of context usage, I think I empirically see the stated issues with Top-K compression strategy. It doesn't catastrophically forget but nuances fade as I reach towards the tail end of my context limits.

1 comment

rishabhaiover

jchandra 3 hours ago

Yeah, that’s consistent. topK keeps the obvious tokens, but subtle context gets eroded over time rather than dropped all at once.