Comment by imiric

17 hours ago

Ah, that's good to know, thanks.

But then why is there compounding token usage in the article's trivial solution? Is it just a matter of using the cache correctly?

Cached tokens are cheaper (roughly a 90% discount) but not free.

  • Also, unlike OpenAI, Anthropic's prompt caching is explicit (you can set up to 4 cache "breakpoints"), meaning that if you don't implement caching, you don't benefit from it.

    • That's a very generous way of putting it. Anthropic's prompt caching is actively hostile and very difficult to implement properly.
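
For reference, a minimal sketch of what "explicit" means here, assuming the Anthropic Messages API request shape: you mark a breakpoint by attaching a `cache_control` field to a content block. The model name and document text below are placeholders, not values from this thread.

```python
# Sketch of a request body with one explicit cache breakpoint.
long_document = "..." * 1000  # stand-in for a large, stable prefix worth caching

payload = {
    "model": "claude-model-placeholder",  # hypothetical model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": long_document,
            # Explicit breakpoint: everything up to and including this block
            # is written to the cache on the first call, and later calls that
            # share the byte-identical prefix read it at the discounted rate.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [{"role": "user", "content": "Summarize the document."}],
}

# Omit cache_control and the full prefix is re-billed at the normal input
# rate on every request -- which is where compounding token usage comes from.
```

This is why the "trivial solution" compounds: unless every request marks (and exactly reuses) the same prefix, nothing is cached.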