Comment by bityard

1 day ago

Halving the precision of the weights is not a free lunch...

bityard

Q8 is virtually lossless. Quantization loss becomes much more noticeable at Q4 and below. Going from FP16 to Q8 on consumer hardware gives roughly 2x the speed at ~99.99% of the quality.

rvba 16 hours ago
Any source that confirms the 99.99% quality?
- Catloafdev 5 minutes ago
  
  I don't have a 'source' off-hand, but I recommend reading up on it if you want to learn more. A lot of models on Hugging Face (HF) have a model card showing the quality trade-offs between the different quants.
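
If you want a rough feel for why 8-bit holds up so much better than 4-bit, here is a minimal sketch that round-trips some weights through a quantizer and measures the error. Big caveats: this is plain symmetric absmax quantization per block of 32 values, which is only an assumption standing in for what real quant formats do, and the weights are synthetic Gaussian noise rather than a real model. The names `quantize_roundtrip` and `relative_rms_error` are made up for this example. Weight error is also not the same thing as model quality, so treat it as intuition, not as confirmation of any particular percentage.

```python
import math
import random


def quantize_roundtrip(weights, bits, block_size=32):
    """Quantize to signed `bits`-bit integers per block, then dequantize.

    Assumption: symmetric absmax scaling, one scale per block. Real quant
    formats differ in their details.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    restored = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        absmax = max(abs(w) for w in block)
        scale = absmax / qmax if absmax > 0 else 1.0
        for w in block:
            q = max(-qmax, min(qmax, round(w / scale)))  # integer code
            restored.append(q * scale)                   # back to float
    return restored


def relative_rms_error(original, restored):
    """RMS of the round-trip error divided by RMS of the original weights."""
    num = sum((a - b) ** 2 for a, b in zip(original, restored))
    den = sum(a * a for a in original)
    return math.sqrt(num / den)


# Synthetic stand-in for a weight tensor (hypothetical data, fixed seed).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]

err_q8 = relative_rms_error(weights, quantize_roundtrip(weights, 8))
err_q4 = relative_rms_error(weights, quantize_roundtrip(weights, 4))

print(f"8-bit relative RMS error: {err_q8:.4%}")
print(f"4-bit relative RMS error: {err_q4:.4%}")
```

With this scheme the step size scales with 1/127 at 8 bits versus 1/7 at 4 bits, so the 4-bit round-trip error should come out much larger than the 8-bit one. For actual quality numbers you would still want perplexity or benchmark comparisons on a real model.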