Comment by jychang

6 hours ago

There's almost a 0% chance that OpenAI doesn't quantize the model right off the bat.

I am willing to bet large amounts of money that OpenAI would never release a model served as fully BF16 in the year of our lord 2026. That would be insane operationally. They're almost certainly doing QAT to FP4 for FFN, and a similar or slightly larger quant for attention tensors.
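To put rough numbers on the operational point, here is a back-of-the-envelope sketch of weight memory at different precisions. The parameter count is a hypothetical picked purely for illustration, not a figure from OpenAI, and the estimate covers weights only: it ignores the KV cache, activations, and the per-block scale factors that real FP4 formats carry.

```python
# Weight-only memory estimate. Ignores KV cache, activations, and the
# per-block scale factors that real FP4 formats add on top.
BITS_PER_PARAM = {"bf16": 16, "fp8": 8, "fp4": 4}


def weight_gib(n_params: int, fmt: str) -> float:
    """Return weight storage in GiB for n_params stored in format fmt."""
    return n_params * BITS_PER_PARAM[fmt] / 8 / 2**30


# Hypothetical parameter count, chosen only for illustration.
N_PARAMS = 100_000_000_000

for fmt in BITS_PER_PARAM:
    print(f"{fmt}: {weight_gib(N_PARAMS, fmt):.1f} GiB")
```

Whatever the real parameter count is, the ratio holds: BF16 weights take 4x the memory of FP4 weights, before any scale-factor overhead.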

selcuka 6 hours ago

It's ok if they never release a BF16 model, but it's less ok if they release it, win the benchmarks, then quantise it after a few weeks.

retinaros 2 hours ago

That is for sure what everyone does. They also train on the evals, using the datasets they would be benchmarked against.