Comment by bcherny
15 hours ago
> 1hr -> 5min on March 6th
This is not accurate. The main agent typically uses a 1h cache (except for API customers, which can enable 1h but it is not on by default because it costs more). Sub-agents typically use a 5m cache.
https://github.com/anthropics/claude-code/issues/46829#issue... - Have you checked with your colleague? (and his AI, of course)
Doesn't what's said at the link approximately agree? The 5m bug was said to be isolated to use of overage (API billing).
Then my original question stands: why did this become an issue seemingly overnight if nothing changed?
So if I run a test suite or compile my rust program in a sub agent I’m going to get cache misses? Boo.
Sub agents don't have much context and don't stay around for long, so misses in that case are trivial.
As of yesterday subagents were often getting the entire session copied to them. Happened to me when 2 turns with Claude spawned a subagent, caused 2 compactions, and burned 15% of my 5-hour limit (Max 5x).
how long they stay around after the cache miss is irrelevant if I am burning all the prior tokens again. also, how much context they have depends entirely on the task and your workflow. I you have a subagent implement a feature and use the compile + test loop to ensure it is implemented correctly before a supervisor agent reviews what was implemented vs asked then yes, subagents do have a lot of context.
... so how do API users enable 1hr caching? I haven't found a setting anywhere.
would like to know this too ;D
there is env.ENABLE_PROMPT_CACHING_1H_BEDROCK - but that is - as the name says "when using Bedrock"
for the raw API the docs are also clear -> "ttl": "1h" https://platform.claude.com/docs/en/build-with-claude/prompt...
but how to make claude-code send that when paying by API-key? or when using a custom ANTHROPIC_BASE_URL? (requests will contain cache_control, but no ttl!)