
Comment by EGreg

19 days ago

You’re not wrong

Using an LLM and caching common queries, e.g. FAQs, can save a lot of token credits
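As a sketch of what that can look like, here is an application-level cache in front of the model call. The `call_llm` helper is hypothetical (it stands in for whatever provider SDK you use), and the normalization and TTL policy are illustrative choices, not any particular library's API:

```python
import hashlib
import time

# Hypothetical stand-in for your provider SDK (openai, anthropic, ...).
def call_llm(prompt: str) -> str:
    return f"(model answer to: {prompt})"

_CACHE: dict[str, tuple[float, str]] = {}  # key -> (timestamp, answer)
TTL_SECONDS = 24 * 3600  # FAQ answers rarely change; tune to taste

def _key(question: str) -> str:
    # Cheap normalization so trivially different phrasings share a key.
    normalized = " ".join(question.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()

def answer(question: str) -> str:
    k = _key(question)
    hit = _CACHE.get(k)
    if hit is not None and time.time() - hit[0] < TTL_SECONDS:
        return hit[1]            # cache hit: zero tokens billed
    result = call_llm(question)  # cache miss: pay for tokens once
    _CACHE[k] = (time.time(), result)
    return result

# Repeated FAQs hit the cache instead of the paid API:
print(answer("What is your refund policy?"))
print(answer("what is your refund  policy?"))  # same answer, no second call
```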

AI is basically solving a search problem, and the models are just approximations of the data, like linear regression or Fourier transforms.

The training is basically your precalculation. The key is that it precalculates a model with billions of parameters rather than overfitting to an exact set of memorized answers, hehe
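In the regression analogy, the split looks like this (generic loss $\ell$ over training pairs $(x_i, y_i)$; nothing here is LLM-specific):

```latex
\underbrace{\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{n}
  \ell\bigl(f_{\theta}(x_i),\, y_i\bigr)}_{\text{training: the expensive precalculation}}
\qquad
\underbrace{\hat{y} = f_{\hat{\theta}}(x)}_{\text{inference: cheap evaluation}}
```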

> Using an LLM and caching common queries, e.g. FAQs, can save a lot of token credits

Do LLM providers use caches for FAQs without changing the number of tokens billed to the customer?

  • No, why would they? You're supposed to maintain that cache yourself.

    What I really want to know about is caching the large prefixes of prompts. Do they let you manage this somehow? What about Llama and DeepSeek?
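For what it's worth, some hosted APIs do expose this. Below is a minimal sketch using Anthropic's explicit prompt-caching markers; the model id is illustrative, and field names follow Anthropic's Messages API as of this writing, so double-check current docs. OpenAI and DeepSeek instead apply prefix caching automatically and discount cache hits on the bill, and self-hosted Llama-family stacks can reuse the KV cache directly (e.g., vLLM's prefix caching).

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LONG_PREFIX = "..."  # e.g., a big system prompt, schema, or document dump

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model id
    max_tokens=512,
    system=[
        {
            "type": "text",
            "text": LONG_PREFIX,
            # Marks everything up to here as a cacheable prefix; later
            # requests sharing it are billed at a reduced cache-read rate.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "What does the prefix say about X?"}],
)

# Usage stats report how much of the prompt was served from cache.
print(response.usage.cache_creation_input_tokens,
      response.usage.cache_read_input_tokens)
```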