Slacker News

Comment by smcleod

1 year ago

I get around 4-5 t/s with the Unsloth 1.58-bit quant on my home server, which has 2x 3090, a Ryzen 9, and 192GB of DDR5. Usable but slow.

3 comments


segmondy  1 year ago

How much context size?

  • smcleod  1 year ago

    Just 4K. Because DeepSeek doesn't allow the use of flash attention, you can't run quantised qkv.
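
For a rough sense of why an unquantised cache limits context, here is a minimal sketch of the usual key/value cache size arithmetic for plain multi-head attention. The model shape below (`LAYERS`, `KV_HEADS`, `HEAD_DIM`) is a hypothetical placeholder, not DeepSeek's actual dimensions, and DeepSeek's own attention layout differs from this plain form, so treat the output as illustrative only.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem):
    """Size in bytes of a plain (non-latent) key/value cache for one sequence."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical model shape, chosen only for illustration (an assumption).
LAYERS, KV_HEADS, HEAD_DIM = 60, 128, 128

for ctx in (4096, 32768):
    # 2 bytes per element for f16; 1 byte per element approximates an 8-bit cache.
    f16 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, ctx, 2)
    q8 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, ctx, 1)
    print(f"ctx={ctx}: f16 {f16 / 2**30:.1f} GiB, 8-bit {q8 / 2**30:.1f} GiB")
```

The cache grows linearly with context length, so halving the bytes per element roughly doubles the context that fits in the same memory.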
