Comment by imiric

17 hours ago

Ah, that's good to know, thanks.

But then why is there compounding token usage in the article's trivial solution? Is it just a matter of using the cache correctly?

Cached tokens are cheaper (roughly a 90% discount) but not free.

  • Also, unlike OpenAI, Anthropic's prompt caching is explicit (you can set up to 4 cache "breakpoints"), meaning that if you don't implement caching, you don't benefit from it.

    • That's a very generous way of putting it. Anthropic's prompt caching is actively hostile and very difficult to implement properly.
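
For reference, a minimal sketch of what "explicit" means here, assuming the Anthropic Messages API request shape: you mark a breakpoint by attaching a `cache_control` field to a content block. The model name and document text below are placeholders, not values from this thread.

```python
# Sketch of a request body with one explicit cache breakpoint.
long_document = "..." * 1000  # stand-in for a large, stable prefix worth caching

payload = {
    "model": "claude-model-placeholder",  # hypothetical model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": long_document,
            # Explicit breakpoint: everything up to and including this block
            # is written to the cache on the first call, and later calls that
            # share the byte-identical prefix read it at the discounted rate.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [{"role": "user", "content": "Summarize the document."}],
}

# Omit cache_control and the full prefix is re-billed at the normal input
# rate on every request -- which is where compounding token usage comes from.
```

This is why the "trivial solution" compounds: unless every request marks (and exactly reuses) the same prefix, nothing is cached.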