Comment by fzysingularity
7 months ago
If I had to guess, the OpenAI open-source model got delayed because Kimi K2 stole their thunder and beat their numbers.
Someone at OpenAI did say it was too big to host at home, so you could be right. They are probably benchmaxxing right now, searching for a few evals they can beat.
These are all "too big to host at home". I don't think that is the issue here.
https://github.com/MoonshotAI/Kimi-K2/blob/main/docs/deploy_...
"The smallest deployment unit for Kimi-K2 FP8 weights with 128k seqlen on mainstream H200 or H20 platform is a cluster with 16 GPUs with either Tensor Parallel (TP) or "data parallel + expert parallel" (DP+EP)."
16 GPUs costing ~$30k each. No one is running a ~$500k server at home.
For most people, before it makes sense to just buy all the hardware yourself, you should probably be renting GPUs by the hour from one of the providers serving that need. On Modal, I think it should cost about $72/hr to serve Kimi K2: https://modal.com/pricing
Once that's running it can serve the needs of many users/clients simultaneously. It'd be too expensive and underutilized for almost any individual to use regularly, but it's not unreasonable for them to do it in short intervals just to play around with it. And it might actually be reasonable for a small number of students or coworkers to share a $70/hr deployment for ~40hr/week in a lot of cases; in other cases, that $70/hr expense could be shared across a large number of coworkers or product users if they use it somewhat infrequently.
So maybe you won't host it at home, but it's actually quite feasible to self-host, and is it ever really worth physically hosting anything at home except as a hobby?
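A rough sketch of the buy-versus-rent arithmetic above, using only the figures quoted in this thread (16 GPUs at ~$30k each, ~$72/hr to rent, ~40 hr/week of use). The group size of 10 is a made-up illustration, and the calculation ignores power, networking, the rest of the server, and depreciation, so treat it as back-of-the-envelope only:

```python
# Back-of-the-envelope comparison of buying vs. renting a 16-GPU deployment.
# All inputs are the approximate figures quoted in the thread, not real quotes.

GPU_COUNT = 16                 # smallest deployment unit cited for Kimi K2 FP8
GPU_PRICE_USD = 30_000         # rough per-GPU price from the thread
RENTAL_USD_PER_HOUR = 72       # rough hourly rental estimate from the thread
HOURS_PER_WEEK = 40            # shared usage pattern suggested in the thread


def purchase_cost(gpu_count: int = GPU_COUNT, gpu_price: float = GPU_PRICE_USD) -> float:
    """GPU-only purchase cost; excludes chassis, networking, and power."""
    return gpu_count * gpu_price


def weekly_rental_cost(rate: float = RENTAL_USD_PER_HOUR, hours: float = HOURS_PER_WEEK) -> float:
    """Cost of renting the deployment for the given hours in one week."""
    return rate * hours


def break_even_hours(purchase: float, rate: float = RENTAL_USD_PER_HOUR) -> float:
    """Rented hours at which cumulative rental spend equals the purchase cost."""
    return purchase / rate


def per_user_hourly(rate: float, users: int) -> float:
    """Hourly cost per person if the deployment is split evenly across users."""
    return rate / users


if __name__ == "__main__":
    buy = purchase_cost()
    hours = break_even_hours(buy)
    print(f"GPU purchase cost: ${buy:,.0f}")
    print(f"Weekly rental at {HOURS_PER_WEEK} hr/week: ${weekly_rental_cost():,.0f}")
    print(f"Break-even: {hours:,.0f} rented hours (~{hours / HOURS_PER_WEEK:,.0f} weeks)")
    # Hypothetical group of 10 people sharing one deployment.
    print(f"Per-user cost with 10 users: ${per_user_hourly(RENTAL_USD_PER_HOUR, 10):.2f}/hr")
```

With these inputs, renting for 40 hr/week takes roughly three years to add up to the GPU purchase price, which is the main reason renting looks better for occasional or shared use.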
I think what GP means is that because the (hopefully) pending OpenAI release is also "too big to run at home", these two models may be close enough in size that they seem more directly comparable, meaning that it's even more important for OpenAI to outperform Kimi K2 on some key benchmarks.
This is a dumb question I know, but how expensive is model distillation? How much training hardware do you need to take something like this and create a 7B and 12B version for consumer hardware?
The real users for these open source models are businesses that want something on premises for data privacy reasons.
Not sure if they’ll trust a Chinese model, but dropping $50-100k on a quantized model that replaces, say, 10 paralegals is good enough for a law firm.
According to the benchmarks, Kimi K2 beats GPT-4.1 in many ways. So to "compete", OpenAI would have to release the GPT-4.1 weights, or a similar model. Which, I guess, they likely won't do.