
Comment by piperswe

2 days ago

A parameter can be a float of any size. Lots of downloadable models are FP8 (8 bits per parameter), but it appears this model is FP16 (16 bits per parameter).

Often, training is done in FP16 and then quantized down to FP8 or FP4 for distribution.

I think they are bfloat16, not FP16, but both are 16-bits-per-weight formats, so it doesn't make a size difference.
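
For a rough sense of what bits-per-parameter means for download size, here's a back-of-the-envelope sketch in Python. The parameter count is a made-up example, not this model's actual size:

```python
# Rough weights-only size estimate at different precisions
# (ignores metadata, tokenizer files, and container overhead).
BITS_PER_PARAM = {"fp32": 32, "bf16": 16, "fp16": 16, "fp8": 8, "fp4": 4}

def weights_size_gb(num_params: float, fmt: str) -> float:
    """Gigabytes needed to store the raw weights in the given format."""
    return num_params * BITS_PER_PARAM[fmt] / 8 / 1e9

num_params = 12e9  # hypothetical 12B-parameter model, purely illustrative
for fmt in ("bf16", "fp8", "fp4"):
    print(f"{fmt}: ~{weights_size_gb(num_params, fmt):.0f} GB")
# bf16: ~24 GB, fp8: ~12 GB, fp4: ~6 GB
```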

  • Pardon the ignorance, but it's the first time I've heard of bfloat16.

    I asked chat for an explanation and it said bfloat16 has a higher range (like FP32) but less precision.

    What does that mean for image generation, and why was bfloat16 chosen over FP16?

    • My fuzzy understanding, and I'm not at all an expert on this, is that the main benefit is that bf16 is less prone to overflow/underflow during calculation, which is a source of bigger problems in both training and inference than the simple loss of precision. So once it became widely supported, it became a commonly preferred format for models (whether image gen or otherwise) over FP16. (The sketch below makes the range/precision trade-off concrete.)
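
To make the range-vs-precision point above concrete, here's a small sketch using PyTorch (my choice of library, not something from the thread). FP16 overflows to inf just past 65504, while bf16 keeps FP32's exponent range at the cost of a coarser mantissa:

```python
import torch

# Range: fp16 tops out around 65504; bf16 shares fp32's 8-bit exponent,
# so its max is on the order of 3.4e38.
print(torch.finfo(torch.float16).max)    # 65504.0
print(torch.finfo(torch.bfloat16).max)   # ~3.39e38

# Overflow: a value that is unremarkable in bf16 becomes inf in fp16.
x = torch.tensor(70000.0)
print(x.to(torch.float16))    # inf
print(x.to(torch.bfloat16))   # finite, just rounded

# Precision: fp16 has 10 mantissa bits vs bf16's 7, so fp16 resolves
# smaller steps around 1.0 (smaller eps = finer precision).
print(torch.finfo(torch.float16).eps)    # ~0.000977
print(torch.finfo(torch.bfloat16).eps)   # 0.0078125
```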