Comment by simonw
1 day ago
I'm expecting (and indeed hoping) that the open weights OpenAI model is a lot smaller than K2. K2 is 1 trillion parameters and almost a terabyte to download! There's no way I'm running that on my laptop.
I think the sweet spot for local models may be around the 20B size - that's Mistral Small 3.x and some of the Gemma 3 models. They're very capable and run in less than 32GB of RAM.
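(Back-of-the-envelope sketch of why ~20B fits under 32GB; this assumes weights dominate memory and uses a flat ~20% multiplier for KV cache and runtime overhead, so the numbers are illustrative, not benchmarks:)

    # Rough RAM estimate for running a local model: weights dominate,
    # with a flat multiplier standing in for KV cache and overhead.
    def model_ram_gb(params_b: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
        weight_bytes = params_b * 1e9 * bits_per_weight / 8
        return weight_bytes * overhead / 1e9

    print(f"{model_ram_gb(20, 4):.0f} GB")  # ~12 GB at 4-bit quantization
    print(f"{model_ram_gb(20, 8):.0f} GB")  # ~24 GB at 8-bit, still under 32 GB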
I really hope OpenAI put one out in that weight class, personally.
Early rumours (from a hosting company that apparently got early access) were that you'd need "multiple H100s to run it", so I doubt it's a Gemma / Mistral Small tier model.
I think you're right; I've seen a couple of other comments now that indicate the same thing.
You will get a 20GB model. Distillation is so compute-efficient that it's all but inevitable: if OpenAI doesn't do it, numerous other companies will.
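For anyone unfamiliar: distillation trains a small "student" to match a big "teacher" model's output distribution, so only the student needs gradients, which is the source of the compute savings versus pretraining from scratch. A minimal PyTorch sketch of the standard soft-label loss (the temperature and shapes are illustrative assumptions, not anything OpenAI has confirmed):

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, T: float = 2.0):
        # KL divergence between temperature-softened teacher and student
        # distributions; the T*T factor keeps gradient magnitudes
        # comparable across temperatures.
        s = F.log_softmax(student_logits / T, dim=-1)
        t = F.softmax(teacher_logits / T, dim=-1)
        return F.kl_div(s, t, reduction="batchmean") * (T * T)

    # Usage sketch (shapes only; real training streams teacher logits
    # over a large corpus):
    student_logits = torch.randn(8, 32000, requires_grad=True)
    with torch.no_grad():
        teacher_logits = torch.randn(8, 32000)
    loss = distillation_loss(student_logits, teacher_logits)
    loss.backward()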
I would rather have an open-weights model that's the best possible one I can run and fine-tune myself, allowing me to exceed SOTA models on the narrower domain my customers care about.
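That workflow is pretty accessible these days with parameter-efficient fine-tuning. A minimal sketch using Hugging Face's peft library; the checkpoint name and target module names are assumptions that would depend on the actual released architecture:

    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    # Hypothetical checkpoint name; substitute whatever actually ships.
    model = AutoModelForCausalLM.from_pretrained("openai/open-weights-20b")

    # LoRA trains small low-rank adapters instead of the full weights,
    # which is what makes domain fine-tuning feasible on modest hardware.
    config = LoraConfig(
        r=16,
        lora_alpha=32,
        target_modules=["q_proj", "v_proj"],  # assumes Llama-style attention names
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # typically well under 1% of total params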