Comment by packetlost
18 hours ago
Nah, I model hop constantly as I work with serving GLM and Kimi models and they're not nearly as good as Opus 4.5+ and GPT 5.2+ and it's not particularly close. They're good by standards set a generation or two ago, but they're really not competitive with where the frontier models are at now.
They compete with "mini" or "nano" model classes quite well given the price of inference. You'd need to "model hop" anyway, using Opus for everything is quite wasteful.
Those aren't really "frontier models" now, are they?
They are on the frontier of local models, where the game is often to get the best bang for the buck. You can always scale model size and compute (Mythos, GPT Pro, Gemini DeepThink) to reach better outcomes, but that's not a very interesting strategy.
Guess it really depends on what you use them for. I've been able to build whole apps with them, not slop. Kimi is quite good at design; for 3D, I noticed Gemini 3.1 is excellent for basic to medium use cases.
I've tried both Opus and GPT 5.4; they also hallucinate just like the rest, at a much higher cost.
The more you use a model over time, the better you become with it. It's really hard to measure; my main metric lately has been tokens per second / time to complete a task.
At this point I have the feeling frontier models are optimizing for benchmarks and one-shot prompts.