
Comment by zoogeny

5 hours ago

As an aside, I built a tool to manage my own chat interface over the provider APIs. I added caching because the savings are quite significant, and I have a little countdown timer that shows how much time remains until the cache expires.

However, for basic turn-based conversation the cache lifetime (5 minutes) is almost always too short. By the time I read the LLM response, consider my next question, and write it out, I have frequently missed the cache.
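For anyone curious, the countdown idea can be sketched roughly like this. This is not my actual tool, just a minimal illustration: the class name `CacheCountdown` and its methods are made up, and it assumes the cache lifetime restarts each time a request touches the cached prefix, which may not match every provider.

```python
import time


class CacheCountdown:
    """Tracks how long a prompt cache entry is expected to stay warm.

    Hypothetical helper: assumes a fixed lifetime (300 s by default) that
    restarts whenever a request touches the cached prefix.
    """

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock        # injectable so the timer can be tested
        self._last_touch = None    # None until the first request is sent

    def touch(self):
        """Call right after a request that writes or reads the cache."""
        self._last_touch = self._clock()

    def remaining(self):
        """Seconds left before the cache is expected to expire."""
        if self._last_touch is None:
            return 0.0
        elapsed = self._clock() - self._last_touch
        return max(0.0, self.ttl_seconds - elapsed)

    def is_warm(self):
        """True if the next request is still expected to hit the cache."""
        return self.remaining() > 0.0
```

In a chat UI you would call `touch()` after each response arrives and poll `remaining()` to draw the timer. The timer only estimates the cache state locally; whether a request actually hit the cache has to come from the provider's response.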

I imagine it is much more useful if you have a tool with a common prefix (like a system instruction, tool specs, or a common set of context shared across many users).

If you can get cache hits frequently enough, the savings are well worth it.
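To put "frequently enough" in rough numbers, here is a back-of-the-envelope sketch. It assumes a simple pricing model where a cache write costs `write_multiplier` times the normal input price and a cache read costs `read_multiplier` times it. Both function names and both parameters are hypothetical, and the multipliers you plug in should come from your provider's price list, not from this comment.

```python
def relative_prefix_cost(hit_rate, write_multiplier, read_multiplier):
    """Average cost of the cached prefix relative to uncached input (1.0).

    Assumes every request either hits the cache (billed at read_multiplier)
    or misses and rewrites it (billed at write_multiplier).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * read_multiplier + (1.0 - hit_rate) * write_multiplier


def break_even_hit_rate(write_multiplier, read_multiplier):
    """Hit rate above which caching is cheaper than not caching at all."""
    if write_multiplier <= read_multiplier:
        raise ValueError("expected writes to cost more than reads")
    # Solve h * read + (1 - h) * write = 1 for h, clamped at zero.
    return max(0.0, (write_multiplier - 1.0) / (write_multiplier - read_multiplier))
```

With made-up multipliers such as `write_multiplier=2.0` and `read_multiplier=0.5`, you would need to hit the cache on about two thirds of requests before caching pays for itself. A shared prefix across many users clears a bar like that far more easily than one person typing slowly.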