Comment by kang
17 hours ago
> tokens written to cache all at once, which would eat up a significant % of your rate limits
Constructing the context is not an LLM pass, so it shouldn't even count towards token usage. The word 'caching' itself says "don't recompute me".
Since the devs on HN (& the whole world) are buying what looks like nonsense to me, what am I missing?