
Comment by mariopt

17 hours ago

True, many people don't know GLM 5.1 and Kimi 2.6, which are really on par with frontier models. There's also Minimax 2.7, DeepSeek 4, Qwen, Xiaomi 2.5 Pro, etc.

China is leading in open source frontier models, so I don't really see how the US wins this one. At some point, companies and people will start running their own models in the cloud and locally, and Chinese models will be everywhere.

Nah, I model hop constantly since I work on serving GLM and Kimi models, and they're not nearly as good as Opus 4.5+ and GPT 5.2+. It's not particularly close. They're good by standards set a generation or two ago, but they're really not competitive with where the frontier models are now.

  • Guess it really depends on what you use them for. I've been able to build whole apps with them, not slop. Kimi is quite good at design; for 3D, I noticed Gemini 3.1 is excellent for basic to medium use cases.

    I've tried both Opus and GPT 5.4; they also hallucinate just like the rest, at a much higher cost.

    The more you use a model over time, the better you become with it. It's really hard to measure; my main metric lately has been tokens per second and time to complete a task.

    At this point I have the feeling frontier models are optimizing for benchmarks and one-shot prompts.

If you actually use them you'll see that they are far from frontier models. They are much more cost-effective for what they are, but frontier they are not.