Slacker News

Comment by cgdl

11 days ago

Very cool. For the INT4 QAT model, what is the recommended precision for the activations, and for the keys and values stored in the KV cache?

1 comment

hnuser123456  11 days ago

For keys, you probably want at least q5 or q6; for values, q4 is fine.
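
To get a rough sense of what a mixed-precision KV cache saves, here is a minimal sketch that estimates cache size for different key/value formats. It assumes "q5" and "q4" mean ggml-style block formats (q5_0 and q4_0, 32 elements per block plus an fp16 scale); the thread does not name a runtime, so that mapping is an assumption. The model shape and the `kv_cache_bytes` helper are hypothetical and chosen only for illustration.

```python
# Effective bits per element for ggml-style block formats: 32 elements per
# block plus an fp16 scale (q5_0 also stores 32 extra high bits per block).
BITS_PER_ELEMENT = {"f16": 16.0, "q8_0": 8.5, "q5_0": 5.5, "q4_0": 4.5}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, k_type, v_type):
    """Return the approximate KV cache size in bytes for the given K/V formats."""
    # Number of stored elements for keys alone (values have the same count).
    elements = n_layers * n_kv_heads * head_dim * n_tokens
    k_bits = elements * BITS_PER_ELEMENT[k_type]
    v_bits = elements * BITS_PER_ELEMENT[v_type]
    return (k_bits + v_bits) / 8


# Hypothetical model shape, not taken from the thread.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128, n_tokens=8192)

baseline = kv_cache_bytes(**shape, k_type="f16", v_type="f16")
mixed = kv_cache_bytes(**shape, k_type="q5_0", v_type="q4_0")

print(f"f16/f16:   {baseline / 2**20:.0f} MiB")   # 1024 MiB
print(f"q5_0/q4_0: {mixed / 2**20:.0f} MiB")      # 320 MiB
```

This only covers memory; it says nothing about the accuracy side of the trade-off, which is the reason the reply suggests keeping keys at a higher precision than values.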
