Comment by irthomasthomas
8 days ago
If these are FP4 like the other Ollama models, then I'm not very interested. If I'm using an API anyway, I'd rather use the full weights.
OpenAI has only provided MXFP4 weights. These are the same weights used by other cloud providers.
Oh, I didn't know that. Weird!
It was natively trained in FP4. Probably both to reduce VRAM usage at inference time (fits on a single H100), and to allow better utilization of B200s (which are especially fast for FP4).
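For anyone curious what that means at the storage level: MXFP4 (per the OCP Microscaling spec) packs weights as 4-bit E2M1 values in blocks of 32, with each block sharing one power-of-two E8M0 scale. Here's a minimal NumPy sketch of the decode step plus the VRAM arithmetic behind the "fits on a single H100" point; the function name is mine, and the ~117B parameter count is an approximation:

```python
import numpy as np

# Representable values of FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bit),
# indexed by the 4-bit code point, per the OCP Microscaling (MX) spec.
FP4_E2M1_VALUES = np.array(
    [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0,            # sign bit = 0
     -0.0, -0.5, -1.0, -1.5, -2.0, -3.0, -4.0, -6.0],   # sign bit = 1
    dtype=np.float32,
)

BLOCK = 32  # MX block size: 32 FP4 elements share one scale

def dequantize_mxfp4(codes: np.ndarray, scale_bits: np.ndarray) -> np.ndarray:
    """Decode MXFP4: 4-bit codes plus one E8M0 scale byte per 32-element block.

    codes      -- uint8 array of 4-bit code points (0..15), length a multiple of 32
    scale_bits -- uint8 array, one E8M0 exponent byte per block
    """
    vals = FP4_E2M1_VALUES[codes]                             # look up FP4 values
    scales = np.exp2(scale_bits.astype(np.float32) - 127.0)   # E8M0 scale = 2^(e-127)
    return (vals.reshape(-1, BLOCK) * scales[:, None]).reshape(-1)

# Rough footprint: 4 bits per weight plus one scale byte per 32 weights
# ~= 0.53 bytes/param, so a ~117B-param model is ~62 GB, under an H100's 80 GB.
params = 117e9  # approximate parameter count of gpt-oss-120b (assumption)
print(f"~{params * (0.5 + 1/32) / 1e9:.0f} GB of weights")
```

In full BF16 the same weights would be ~234 GB, which is why the MXFP4 release is what every provider serves.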