Comment by bityard 1 day ago
Halving the precision of the weights is not a free lunch...

Catloafdev 21 hours ago
Q8 is virtually lossless. The quantization is much more noticeable around Q4 and below. FP16->Q8 on consumer hardware is 2x the speed at ~99.99% the quality.

rvba 12 hours ago
Any source that confirms the 99.99% quality?
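
For anyone who wants to check the Q8-versus-Q4 gap themselves, here is a minimal sketch that measures round-trip error for 8-bit and 4-bit quantization. Everything in it is an assumption made for illustration: the weights are synthetic Gaussian values rather than a real model, the scheme is plain symmetric absmax quantization over blocks of 32 values (loosely in the spirit of block-wise formats, not a reproduction of any specific one), and the helper names `quantize_roundtrip` and `relative_rms_error` are hypothetical.

```python
import random


def quantize_roundtrip(weights, bits, block_size=32):
    """Quantize each block with a symmetric absmax scale, then dequantize.

    Illustrative scheme only; real formats differ in block layout and rounding.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    restored = []
    for start in range(0, len(weights), block_size):
        block = weights[start:start + block_size]
        absmax = max(abs(w) for w in block)
        scale = absmax / qmax if absmax > 0 else 1.0
        for w in block:
            # Round to the nearest integer level and clamp to the signed range.
            q = max(-qmax, min(qmax, round(w / scale)))
            restored.append(q * scale)
    return restored


def relative_rms_error(original, restored):
    """RMS of the reconstruction error divided by RMS of the original values."""
    num = sum((a - b) ** 2 for a, b in zip(original, restored))
    den = sum(a * a for a in original)
    return (num / den) ** 0.5


random.seed(0)
# Synthetic stand-in for a weight tensor (assumption: zero-mean Gaussian).
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

err_q8 = relative_rms_error(weights, quantize_roundtrip(weights, 8))
err_q4 = relative_rms_error(weights, quantize_roundtrip(weights, 4))

print(f"8-bit relative RMS error: {err_q8:.4%}")
print(f"4-bit relative RMS error: {err_q4:.4%}")
```

This only measures how closely the weights themselves are reconstructed. It says nothing direct about output quality, so it neither confirms nor refutes the 99.99% figure; that would need a perplexity or benchmark comparison on an actual model.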