Comment by bcherny

19 hours ago

> 1hr -> 5min on March 6th

This is not accurate. The main agent typically uses a 1h cache (except for API customers, which can enable 1h but it is not on by default because it costs more). Sub-agents typically use a 5m cache.

9 comments

bcherny

throwdbaaway 19 hours ago

https://github.com/anthropics/claude-code/issues/46829#issue... - Have you checked with your colleague? (and his AI, of course)

fluidcruft 18 hours ago

Doesn't what's said at the link approximately agree? The 5m bug was said to be isolated to use of overage (API billing).

mvkel 14 hours ago

Then my original question stands: why did this become an issue seemingly overnight if nothing changed?

aaronblohowiak 19 hours ago

So if I run a test suite or compile my rust program in a sub agent I’m going to get cache misses? Boo.

skeledrew 17 hours ago
Sub agents don't have much context and don't stay around for long, so misses in that case are trivial.
- HumanOstrich 11 hours ago
  
  As of yesterday subagents were often getting the entire session copied to them. Happened to me when 2 turns with Claude spawned a subagent, caused 2 compactions, and burned 15% of my 5-hour limit (Max 5x).
- aaronblohowiak 10 hours ago
  
  how long they stay around after the cache miss is irrelevant if I am burning all the prior tokens again. also, how much context they have depends entirely on the task and your workflow. I you have a subagent implement a feature and use the compile + test loop to ensure it is implemented correctly before a supervisor agent reviews what was implemented vs asked then yes, subagents do have a lot of context.

highd 15 hours ago

... so how do API users enable 1hr caching? I haven't found a setting anywhere.

g4cg54g54 11 hours ago

would like to know this too ;D
there is env.ENABLE_PROMPT_CACHING_1H_BEDROCK - but that is - as the name says "when using Bedrock"
for the raw API the docs are also clear -> "ttl": "1h" https://platform.claude.com/docs/en/build-with-claude/prompt...
but how to make claude-code send that when paying by API-key? or when using a custom ANTHROPIC_BASE_URL? (requests will contain cache_control, but no ttl!)