Comment by yumraj
15 hours ago
> Since Claude Code uses a 1 hour prompt cache window for the main agent, if you leave your computer for over an hour then continue a stale session, it's often a full cache miss. To improve this, we have shipped a few UX improvements (eg. to nudge you to /clear before continuing a long stale session), and are investigating defaulting to 400k context instead
I don’t understand this. I frequently have long breaks. I never want to clear or even compact because I don’t want to lose the conversations that I’ve had and the context. Clearing etc causes other issues like I have to restate everything at times and it misses things. I do try to update the memory which helps. I wish there was a better solution than a time bound cache
Makes me wish that shortly before the server-side expiration, we could save the cache on the client-side, indefinitely.
But my understanding is that we're talking about ~60GB of data per session, so it sounds unrealistic to do...
Where are you getting 60GB from? It shouldn’t be that large.
But yes, would love to save context/cache such that it can be played back/referred to if needed.
/compact is a little black box that I just have to trust that is keeping the important bits.
The KV cache consists of activation vectors for every attention head at every layer of the model for every token, so it gets quite large. ChatGPT also estimates 60-100GB for full token context of an Opus-sized model:
https://chatgpt.com/share/69dc5030-268c-83e8-92c2-6cef962dc5...