Comment by doph
11 hours ago
Is a KV cache not a kind of state? What does statefulness have to do with selfhood? How does a system prompt work at all if these things have no reference to themselves?
The KV cache is not persistent. It's hyper-short-term memory.
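A toy sketch of the distinction, with purely illustrative names (this is not any real inference library's API): the cache is genuine state, but it is scoped to a single generation call and is never written back into the weights.

```python
# Toy model: forward() consumes one token plus the cache built so far.
class ToyModel:
    def forward(self, token, kv_cache):
        kv_cache.append(token)             # stand-in for per-layer K/V tensors
        return (token + 1) % 10, kv_cache  # stand-in for sampling a next token

def generate(model, prompt_tokens, max_new_tokens):
    kv_cache = []                          # created fresh for this request
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token, kv_cache = model.forward(tokens[-1], kv_cache)
        tokens.append(next_token)
    return tokens                          # kv_cache is dropped on return:
                                           # nothing persists into the weights

print(generate(ToyModel(), [1, 2, 3], 5))  # [1, 2, 3, 4, 5, 6, 7, 8]
```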
Modern KV caches can hold up to 1 million tokens (~3000 pages of text). That's not so short; it's roughly 48 straight hours of reading.
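Back-of-envelope arithmetic behind those figures (the per-token, per-page, and reading-speed constants are rough rules of thumb, not exact):

```python
tokens = 1_000_000
words = tokens * 0.75      # ~0.75 English words per token (rule of thumb)
pages = words / 250        # ~250 words per printed page
hours = words / 250 / 60   # ~250 words per minute of reading

print(f"~{words:,.0f} words, ~{pages:,.0f} pages, ~{hours:.0f} hours")
# -> ~750,000 words, ~3,000 pages, ~50 hours
```

which lands in the same ballpark as the 3000-page / 48-hour figures above.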
Yes and no. It's not just text but images, video, etc., and it's not just the pages of content but all of the "thinking" as well. Plus, the models tend to work better earlier in the context.
I regularly come close to filling up the context window and have to compact the context. That can happen several times within one session of me working on a problem, which you could argue is roughly my own context window.
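For anyone unfamiliar, "compacting" is roughly this shape. A hypothetical sketch, where `summarize()` stands in for something like another call into the model itself, and the window and tail sizes are made-up constants:

```python
CONTEXT_LIMIT = 200_000   # tokens; assumed window size, for illustration
KEEP_RECENT = 50_000      # tail of the transcript kept verbatim

def maybe_compact(transcript_tokens, summarize):
    """summarize() is a stand-in, e.g. another call into the model."""
    if len(transcript_tokens) < CONTEXT_LIMIT:
        return transcript_tokens
    head = transcript_tokens[:-KEEP_RECENT]
    tail = transcript_tokens[-KEEP_RECENT:]
    return summarize(head) + tail  # lossy: detail in `head` is gone for good
```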
My point, though, was that almost none of the model's knowledge is in the context; it's all in the training. We have no functional long-term memory for LLMs beyond training.