theanonymousone 3 days ago But the Hugging Face link mentions BF16, F16, and I32?
kouteiheika 3 days ago Not every weight is quantized. For example, weights that take up little space or are especially important are left in higher precision. State-of-the-art weight quantization is never uniform (i.e., it is not applied to every weight, nor in the same way).
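To make the non-uniform idea concrete, here is a minimal sketch in plain Python. It is not how any particular checkpoint was produced: the function names, the `min_size` threshold, the `keep_prefixes` list, and the symmetric per-tensor int4 scheme are all assumptions chosen for illustration.

```python
def quantize_int4(values):
    """Symmetric per-tensor int4 quantization (illustrative scheme only).

    Returns (codes, scale) where each code is an integer in -8..7.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def quantize_checkpoint(tensors, min_size=1024, keep_prefixes=("norm", "embed")):
    """Quantize only large tensors that are not on the keep list.

    `tensors` maps a name to a flat list of floats. `min_size` and
    `keep_prefixes` are hypothetical knobs, not taken from any real tool.
    """
    out = {}
    for name, values in tensors.items():
        if len(values) < min_size or name.startswith(keep_prefixes):
            # Small or sensitive tensors stay in their original precision.
            out[name] = ("high_precision", values)
        else:
            codes, scale = quantize_int4(values)
            out[name] = ("int4", codes, scale)
    return out
```

A checkpoint processed this way ends up with a mix of dtypes, which is consistent with a model page listing several tensor types at once.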
zackangelo 3 days ago I don't believe safetensors has a native int4 dtype, so they packed 4 int4s into a bf16 in this checkpoint.
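The arithmetic behind that kind of packing is simple: four 4-bit values fill exactly 16 bits. The sketch below shows one way to pack and unpack them with plain Python integers. The nibble order (lowest index in the lowest bits) and the function names are assumptions for illustration, and nothing here is taken from the checkpoint's actual layout. The resulting word is only a bit container; read as a 16-bit float, it would not have a meaningful numeric value.

```python
def pack_int4x4(vals):
    """Pack four signed int4 values (-8..7) into one unsigned 16-bit word."""
    if len(vals) != 4:
        raise ValueError("expected exactly four values")
    word = 0
    for i, v in enumerate(vals):
        if not -8 <= v <= 7:
            raise ValueError(f"value {v} does not fit in int4")
        # Two's-complement nibble, placed at bit offset 4*i (assumed order).
        word |= (v & 0xF) << (4 * i)
    return word


def unpack_int4x4(word):
    """Recover the four signed int4 values from a 16-bit word."""
    out = []
    for i in range(4):
        nibble = (word >> (4 * i)) & 0xF
        # Nibbles 8..15 represent the negative values -8..-1.
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return out
```

For example, `unpack_int4x4(pack_int4x4([1, -1, 7, -8]))` returns the original four values.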