Comment by rockinghigh
17 hours ago
The MoE experts are quantized to int4; all other weights, such as the shared expert weights, are excluded from quantization and remain in bf16.
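A minimal sketch of what such selective quantization could look like, assuming a flat state dict where routed-expert parameter names contain "experts" and shared-expert names contain "shared_expert" (both naming conventions are assumptions; NumPy float arrays stand in for bf16 tensors):

```python
import numpy as np

def quantize_int4(w):
    # Symmetric per-tensor int4: map values into [-8, 7] with one fp scale.
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def quantize_moe(state_dict):
    out = {}
    for name, w in state_dict.items():
        if "experts" in name and "shared_expert" not in name:
            out[name] = quantize_int4(w)   # routed experts -> int4 + scale
        else:
            out[name] = w                  # everything else kept high precision
    return out
```

Real deployments typically use finer granularity (per-channel or per-group scales) rather than the per-tensor scale shown here.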