Slacker News

Comment by zozbot234

6 hours ago

If the expensive parts of the query happen to run iteratively (especially in agentic loops), you can cap those loops to bound the cost. Even with pure forward generation, you could pause an expensive inference and continue it seamlessly with a cheaper model, adding little to the cost.
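A minimal sketch of the handoff idea, under stated assumptions: `Model`, `toy_next_token`, `generate_with_budget`, and the per-token prices are all hypothetical stand-ins, not any real provider's API or pricing. The loop spends on the expensive model until one more expensive token would exceed the budget, then continues the same token sequence with the cheap model. It ignores the cost of the cheaper model re-reading the existing context, which a real system would presumably have to pay.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Model:
    """Hypothetical model wrapper: a name, a per-token price, a next-token function."""
    name: str
    cost_per_token: float
    next_token: Callable[[List[str]], str]


def toy_next_token(context: List[str]) -> str:
    # Stand-in for real inference: emits a token derived from the context length.
    return f"t{len(context)}"


def generate_with_budget(
    prompt: List[str],
    expensive: Model,
    cheap: Model,
    budget: float,
    max_tokens: int,
    stop: str = "<eos>",
) -> Tuple[List[str], float, List[str]]:
    """Generate up to max_tokens, switching to the cheap model near the budget."""
    tokens = list(prompt)
    spent = 0.0
    used: List[str] = []  # which model produced each step
    model = expensive
    for _ in range(max_tokens):
        # Hand off once one more expensive token would exceed the budget.
        if model is expensive and spent + expensive.cost_per_token > budget:
            model = cheap
        tok = model.next_token(tokens)
        spent += model.cost_per_token
        used.append(model.name)
        if tok == stop:
            break
        tokens.append(tok)
    return tokens, spent, used


if __name__ == "__main__":
    big = Model("big", 1.0, toy_next_token)
    small = Model("small", 0.1, toy_next_token)
    out, cost, used = generate_with_budget(["p"], big, small, budget=3.0, max_tokens=5)
    print(out, round(cost, 2), used)
```

With this scheme the total spend stays within the budget plus at most `max_tokens` times the cheap model's price, which is the sense in which the handoff adds little to the cost. The same check works for an agentic loop if each iteration is treated as one step with its own cost.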
