Comment by wolttam

12 hours ago

It depends on the use case. Yes, 90% of the cost is cache in agentic coding scenarios (actually 95% in my experience). But not when the model reasons for 200k+ tokens before answering a complex problem.
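
To make the arithmetic behind that concrete, here is a rough sketch. The per-token prices and the token counts are made-up placeholders, not any vendor's real rates, and cache-write charges are ignored for simplicity. It only shows how the cost split flips between a long agentic session and a single long-reasoning answer.

```python
# Hypothetical prices in USD per million tokens (assumptions, not real rates).
PRICE_PER_MTOK = {"cached_input": 0.30, "fresh_input": 3.00, "output": 15.00}


def cost_shares(cached_input, fresh_input, output):
    """Return each token category's share of the total cost (values sum to 1)."""
    tokens = {
        "cached_input": cached_input,
        "fresh_input": fresh_input,
        "output": output,
    }
    costs = {k: n * PRICE_PER_MTOK[k] / 1_000_000 for k, n in tokens.items()}
    total = sum(costs.values())
    return {k: c / total for k, c in costs.items()}


# Agentic coding session (invented numbers): 200 turns, each re-reading
# ~100k cached tokens, adding ~300 fresh tokens, emitting ~200 output tokens.
agentic = cost_shares(
    cached_input=200 * 100_000,
    fresh_input=200 * 300,
    output=200 * 200,
)

# One hard problem (invented numbers): 5k-token prompt, nothing cached,
# 200k tokens of reasoning plus answer, all billed as output.
reasoning = cost_shares(cached_input=0, fresh_input=5_000, output=200_000)

for name, shares in (("agentic", agentic), ("reasoning", reasoning)):
    print(name, {k: f"{v:.0%}" for k, v in shares.items()})
```

With these placeholder numbers, cached input dominates the agentic session while output tokens dominate the reasoning case. Real ratios depend entirely on the actual price sheet and workload.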

3 comments

himata4113 12 hours ago

Gemini models solve a problem in 80% fewer tokens, so that's something to think about.

johaugum 11 hours ago
Source?
- himata4113 10 hours ago
  
  https://help.kagi.com/kagi/ai/llm-benchmark.html