Comment by j-pb

15 hours ago

The /clear nudge isn't a solution though. Compacting or clearing just means rebuilding context until Claude is actually productive again. The cost comes either way. I get that 1M context windows cost more than the flat per-token price reflects, because attention scales with context length, but the answer to that is honest pricing or not offering it, not annoying UX nudges. What’s actually indefensible is that Claude is already pushing users to shrink context via, I presume, the system prompt. At maybe 25% fill:

  “This seems like a good opportunity to wrap it up and continue in a fresh context window.”
  “Want to continue in a fresh context window? We got a lot of work done and this next step seems to deserve a fresh start!”

If there’s a cost problem, fix the pricing or the architecture. But please stop the model and UI from badgering users into smaller context windows at every opportunity. That is not a solution, it’s service degradation dressed as a tooltip.

2 comments


foota 3 hours ago

The cost issues they're seeing (at least from what they've stated) come from users, not from internal costs. Basically, it takes either $5 or $6.25 (depending on a 5m or 1h TTL) to re-ingest a 1M-token conversation into cache for Opus 4.6. That's obviously a very high cost, and users are unhappy with it.

I think 400k as a default seems about right from my experience, but just having the ability to control it would be nice. For the record, even just making a tool call at 1M tokens costs 50 cents (which could be amortized if multiple calls are made in a round), so imo costs are just too high at long context lengths for them to be the default.
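To make the arithmetic above concrete, here is a rough sketch that scales the quoted figures to other context sizes. The three per-million-token rates are simply the numbers from this comment ($5 and $6.25 to re-ingest 1M tokens, 50 cents per call at 1M tokens), not verified pricing, and the function names are made up for illustration. It also assumes, as the comment suggests, that a round with several tool calls pays for the cached read once.

```python
# Per-million-token rates in USD, taken from the comment above.
# These are the commenter's figures, not verified pricing.
REINGEST_LOW_PER_MTOK = 5.00
REINGEST_HIGH_PER_MTOK = 6.25
CACHED_READ_PER_MTOK = 0.50  # implied by "50 cents per tool call at 1M tokens"


def cost_usd(context_tokens: int, rate_per_mtok: float) -> float:
    """Cost of processing `context_tokens` once at a per-million-token rate."""
    return context_tokens * rate_per_mtok / 1_000_000


def per_call_cost_usd(context_tokens: int, calls_in_round: int) -> float:
    """Cached-read cost of one round, split evenly across its tool calls.

    Assumes the whole round pays for a single cached read (hypothetical
    simplification of the amortization mentioned in the comment).
    """
    if calls_in_round < 1:
        raise ValueError("calls_in_round must be at least 1")
    return cost_usd(context_tokens, CACHED_READ_PER_MTOK) / calls_in_round


if __name__ == "__main__":
    for tokens in (400_000, 1_000_000):
        low = cost_usd(tokens, REINGEST_LOW_PER_MTOK)
        high = cost_usd(tokens, REINGEST_HIGH_PER_MTOK)
        single = per_call_cost_usd(tokens, 1)
        batched = per_call_cost_usd(tokens, 5)
        print(
            f"{tokens:>9,} tokens: re-ingest ${low:.2f} to ${high:.2f}, "
            f"one call ${single:.2f}, five calls per round ${batched:.2f} each"
        )
```

Under these assumed rates, a 400k default cuts every figure to 40% of the 1M case, which is the gap being argued about here.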

g4cg54g54 4 hours ago

currently "clear makes it worse" https://github.com/anthropics/claude-code/issues/47098 + https://github.com/anthropics/claude-code/issues/47107

launching with `CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS=1 claude "Hello"` till those are fixed seems to be the way