Comment by Catloafdev

1 day ago

Nobody runs unquantized; there's literally no reason to. Q8 is the largest anyone actually runs on consumer hardware for inference.

bityard 1 day ago

Halving the precision of the weights is not a free lunch...

Catloafdev 21 hours ago
Q8 is virtually lossless. Quantization loss becomes much more noticeable around Q4 and below. Going from FP16 to Q8 on consumer hardware gives about 2x the speed at ~99.99% of the quality.
- rvba 12 hours ago
  
  Any source that confirms the 99.99% quality?
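
One rough way to get a feel for the Q8 vs Q4 gap discussed above is to quantize some synthetic weights and compare the round-trip error. The sketch below is only an illustration under stated assumptions: it uses plain symmetric absmax quantization per block, a block size of 32 chosen for illustration, and Gaussian random numbers standing in for real weights. The function names are hypothetical, and this is not the exact scheme any particular inference runtime uses.

```python
import math
import random

def quantize_block(block, bits):
    """Symmetric absmax quantization of one block to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1          # 127 for 8 bits, 7 for 4 bits
    amax = max(abs(x) for x in block)
    scale = amax / qmax if amax > 0 else 1.0
    q = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return q, scale

def dequantize_block(q, scale):
    """Map the integers back to floats using the per-block scale."""
    return [v * scale for v in q]

def relative_rms_error(weights, bits, block_size=32):
    """RMS of (original - restored), divided by RMS of the original weights."""
    err_sq = 0.0
    ref_sq = 0.0
    for i in range(0, len(weights), block_size):
        block = weights[i:i + block_size]
        q, scale = quantize_block(block, bits)
        restored = dequantize_block(q, scale)
        err_sq += sum((a - b) ** 2 for a, b in zip(block, restored))
        ref_sq += sum(a * a for a in block)
    return math.sqrt(err_sq / ref_sq)

# Assumption: synthetic Gaussian values stand in for real model weights.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

err8 = relative_rms_error(weights, 8)
err4 = relative_rms_error(weights, 4)
print(f"8-bit relative RMS error: {err8:.4%}")
print(f"4-bit relative RMS error: {err4:.4%}")
```

This only measures how far the weights move, not how much model output quality changes. Backing up a figure like 99.99% would need perplexity or benchmark comparisons on an actual model at each precision.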