Slacker News

Comment by cgdl

10 days ago

Very cool. For the INT4 QAT model, what precision is recommended for the activations, and for the keys and values stored in the KV cache?

hnuser123456  10 days ago

For keys, you probably want at least q5 or q6; for values, q4 is fine.
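
To put rough numbers on that trade-off, here is a minimal sketch of the KV cache footprint at different key/value bit widths. The model shape is hypothetical (not taken from this thread), and the bit widths are nominal: real block-quantized formats also store per-block scales, which this ignores.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, key_bits, value_bits):
    """Approximate KV cache size in bytes for a single sequence.

    Assumes nominal bits per element and ignores per-block scale overhead.
    """
    # Keys and values each hold one element per layer, KV head, head dim, and token.
    elements_per_tensor = n_layers * n_kv_heads * head_dim * n_tokens
    total_bits = elements_per_tensor * (key_bits + value_bits)
    return total_bits / 8


# Hypothetical model shape, chosen only for illustration.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, n_tokens=8192)

fp16 = kv_cache_bytes(**shape, key_bits=16, value_bits=16)
mixed = kv_cache_bytes(**shape, key_bits=6, value_bits=4)

print(f"16-bit K and V:   {fp16 / 2**20:.0f} MiB")   # 1024 MiB
print(f"6-bit K, 4-bit V: {mixed / 2**20:.0f} MiB")  # 320 MiB
```

Under these assumptions, going from 4-bit to 6-bit keys costs only 64 MiB more for this shape, so keeping keys at the higher precision is cheap.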
