
Comment by gpm

21 hours ago

Huh, cool. I guess that makes a lot of sense with all the success the quantization people have been having.

So am I misunderstanding "Tensor type F32 · I32 · BF16" or is it just tagged wrong?


rockinghigh  19 hours ago

The MoE experts are quantized to int4; all other weights, like the shared expert weights, are excluded from quantization and use bf16.
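A minimal sketch of what "quantized to int4" typically means in practice: floats are mapped to integers in [-8, 7] with a per-group scale, while excluded tensors keep full precision. This is a generic symmetric quantizer for illustration, not the model's actual scheme.

```python
def quantize_int4(weights):
    """Symmetric int4 quantization sketch (hypothetical, not the
    model's actual scheme): one scale per group of weights."""
    scale = max(abs(w) for w in weights) / 7.0 or 1.0
    # Clamp to the signed 4-bit range [-8, 7] after scaling.
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int4(q, scale):
    """Recover approximate float weights from int4 values + scale."""
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.7]
q, s = quantize_int4(w)
approx = dequantize_int4(q, s)
```

The maximum round-trip error per weight is about half the scale, which is why excluding sensitive tensors (like the shared expert weights) from quantization matters.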

liuliu  19 hours ago

I32 is 8 4-bit values packed into one int32.
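A quick sketch of that packing, assuming unsigned 4-bit values laid out low-bits-first (the actual storage layout may differ):

```python
def pack_int4(values):
    """Pack eight 4-bit unsigned values (0..15) into a single 32-bit word."""
    assert len(values) == 8
    packed = 0
    for i, v in enumerate(values):
        assert 0 <= v <= 15, "each value must fit in 4 bits"
        packed |= v << (4 * i)  # value i occupies bits 4i .. 4i+3
    return packed

def unpack_int4(packed):
    """Recover the eight 4-bit values from a packed 32-bit word."""
    return [(packed >> (4 * i)) & 0xF for i in range(8)]

vals = [1, 2, 3, 4, 5, 6, 7, 8]
assert unpack_int4(pack_int4(vals)) == vals
```

So a tensor tagged I32 can really be int4 data; the viewer just reports the container type, not the logical element type.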
