Comment by ledauphin
18 hours ago
nah this doesn't explain it.
most of the users of those third party harnesses care just as much about hitting cache and getting more usage.
I'm watching a conference talk right now from 2 weeks ago: "I Hated Every Coding Agent So I Built My Own - Mario Zechner (Pi)", and in the middle he directly references this.
He demonstrates in the code that OpenCode aggressively trims context: it compacts on every turn and prunes every tool call that occurred more than 40,000 tokens ago. That seems like a good strategy to squeeze more out of the context window, but by editing the oldest context it breaks the prompt cache for the entire conversation. There is effectively no caching happening at all.
https://youtu.be/Dli5slNaJu0
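To see why pruning old messages kills the cache, here's a minimal sketch of how prefix-based prompt caching behaves (a hypothetical toy model, not OpenCode's or Anthropic's actual code): a cache hit requires the new prompt to share an exact prefix with what was cached, so removing an old tool call invalidates everything after it.

```python
# Toy model of prefix-based prompt caching: the provider can only
# reuse cached work for the exact leading prefix that matches the
# previously cached prompt. (Illustrative only.)

def cached_prefix_len(cached_prompt, new_prompt):
    """Number of leading messages served from cache (shared prefix)."""
    n = 0
    for a, b in zip(cached_prompt, new_prompt):
        if a != b:
            break
        n += 1
    return n

# Turn 1: the full conversation is sent and cached.
turn1 = ["sys", "user1", "tool_call1", "tool_result1", "assistant1"]

# Append-only turn 2: turn 1 is an exact prefix -> big cache hit.
turn2_append = turn1 + ["user2"]
assert cached_prefix_len(turn1, turn2_append) == len(turn1)  # 5

# Pruned turn 2: the oldest tool call/result were removed, so the
# prompts diverge at index 2 -> almost nothing comes from cache.
turn2_pruned = ["sys", "user1", "assistant1", "user2"]
assert cached_prefix_len(turn1, turn2_pruned) == 2
```

So even though pruning saves context-window space, every pruned turn re-pays full input-token cost for the whole tail of the conversation.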
Sure. The question is whether they have the same level of expertise and prioritization that Anthropic does.
They are working with the same tools and knowledge as Anthropic does, since caching practices are documented. And they have just as much incentive as Anthropic to not waste compute. Can we stop acting like the people who build harnesses, be it OpenCode or Mario Zechner's Pi, are dumbfucks who don't understand caching?