Comment by throwaw12

13 days ago

Aghhh, in my earlier comments I wished they would release a model that outperforms Opus 4.5 in agentic coding; it seems I'll have to wait longer. But I am hopeful.

15 comments

wyldfire 13 days ago

By the time they release something that outperforms Opus 4.5, Opus 5.2 will have been released, and it will probably be the new state of the art.

But these open weight models are tremendously valuable contributions regardless.

wqaatwt 13 days ago

Qwen 3 Max wasn’t originally open, or did they release it?

frankc 13 days ago

One of the ways the Chinese companies are keeping up is by training their models on the outputs of the American frontier models. I'm not saying they don't innovate in other ways, but this is part of how they caught up quickly. However, it pretty much means they are always going to lag.

Onavo 13 days ago

Does the model collapse proof still hold water these days?

CuriouslyC 13 days ago

Not true, for one very simple reason: AI model capabilities are spiky. Chinese models can SFT off American frontier outputs and use them for LLM-as-judge RL, as you note, but if they choose to RL on top of that for a different capability than Western labs do, they'll be better at that thing (while being worse at the things they don't RL on).

aurareturn 13 days ago

They are. There is no way to lead unless China has access to as much compute power.
- jyscao 12 days ago
  
  They will likely lead in compute power in the medium term, since they’re definitely the country with the highest energy generation capacity at this point. Now they just need to catch up on the hardware front, where I believe they’ve also made significant progress over the last few years.
  
MaxPock 12 days ago

If that's how it is done, we'd have very many models from all manner of countries. I mean, how difficult is distillation for India, Japan, and the EU?

WarmWash 13 days ago

The Chinese just distill Western SOTA models to level up their models, because they are badly compute constrained.

If you were towing someone much weaker behind you in a race, they would be right on your heels, but also not really a threat. Unless they can figure out a more efficient way to run before you do.

esafak 12 days ago

But it is a threat when the performance difference is not worth the cost in the customers' eyes.

OGEnthusiast 13 days ago

Check out the GLM models; they are excellent.

khimaros 13 days ago

MiniMax M2.1 rivals GLM 4.7 and fits in 128GB with 100k context at 3-bit quantization.

auspiv 13 days ago

There have been a couple of "studies" comparing various frontier-tier AIs that concluded Chinese models are somewhere around 7-9 months behind US models. Another comment says that Opus will be at 5.2 by the time Qwen matches Opus 4.5. That's accurate, and there is some data to show by how much.

lofaszvanitt 13 days ago

Like these benchmarks mean anything.