Comment by yieldcrv
14 hours ago
> Almost every model used the canonical provider: Zai for GLM, Deepseek for Deepseek, etc.
> I am never touching Minimax or GLM again. Their APIs had constant outages.
Goofy take
You run these on a VPS, based on the architecture of that VPS provider, or on your own cluster.
Sorry, I don't understand. You're saying the direct providers aren't the canonical source you'd recommend?
If I were running these on my own machine or GPU, wouldn't the argument then be "Well, you didn't use the real providers"?
For the record, I started taking this approach because the Kimi team released this, which was shocking to me: https://github.com/MoonshotAI/K2-Vendor-Verifier
GLM 5.1's smallest model size is 206 GB, and realistically you probably want to run a version that's ~400 GB. If you want it to be performant, you're not just running it on a VPS.
And just saying "run it on your own cluster" sort of glosses over the cost of such a cluster.
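To put rough numbers on that, here's a back-of-the-envelope sketch. The 206 GB and ~400 GB figures are the ones above; the 80 GB per accelerator and the 90% usable fraction are my own assumptions for illustration, and this ignores KV cache and activation memory, so treat the result as a floor rather than a sizing guide.

```python
import math

def min_accelerators(weights_gb: float,
                     mem_per_device_gb: float = 80.0,   # assumption: 80 GB class device
                     usable_fraction: float = 0.9) -> int:  # assumption: ~10% reserved
    """Lower bound on devices needed just to hold the weights in memory."""
    usable_gb = mem_per_device_gb * usable_fraction
    return math.ceil(weights_gb / usable_gb)

# Model sizes taken from the comment above.
for size_gb in (206, 400):
    print(f"{size_gb} GB of weights -> at least {min_accelerators(size_gb)} devices")
```

Even the floor is several high-memory accelerators before you serve a single request, which is the part "run it on your own cluster" skips over.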