Comment by jug

1 day ago

> The Qwen3-Next-80B-A3B-Instruct performs comparably to our flagship model Qwen3-235B-A22B-Instruct-2507

I'm skeptical about these claims. How can this be? Wouldn't there be massive loss of world knowledge? I'm particularly wary given the benchmaxxing trend that took hold in Q2 2025.

> I'm skeptical about these claims. How can this be?

More efficient architecture. Qwen3-Next pairs a much sparser MoE (roughly 3B of its 80B parameters activated per token, per the "A3B" naming) with a hybrid attention design, so it isn't a straightforward dense-versus-dense comparison with the flagship.

> Wouldn't there be massive loss of world knowledge?

If you assume equally efficient architecture and no other salient differences, yes, that’s what you’d expect from a smaller model.
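
For scale, here's a back-of-the-envelope sketch of the sparsity gap, reading the figures straight off the model names. The per-token active counts are assumptions based on Qwen's "AxB" naming convention, not measured values:

```python
# Back-of-the-envelope sparsity comparison. Numbers are read
# off the model names (assumed, not measured: the "AxB" suffix
# is taken to mean ~xB activated parameters per token).
models = {
    "Qwen3-Next-80B-A3B": {"total_b": 80, "active_b": 3},
    "Qwen3-235B-A22B": {"total_b": 235, "active_b": 22},
}

for name, p in models.items():
    ratio = p["active_b"] / p["total_b"]
    print(f"{name}: {p['active_b']}B of {p['total_b']}B "
          f"active per token ({ratio:.1%})")
```

That works out to ~3.8% of parameters active per token versus ~9.4% for the flagship. The "world knowledge" worry tracks the total count (80B vs 235B, about a 3x gap), while per-token compute tracks the active count, and in MoE models the two move independently, which is why the "smaller model" intuition doesn't transfer cleanly.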

  • Hmm. Let's just say that if this is true, and the model is actually better at a much lower total parameter count, it would be the greatest accomplishment in over a year of LLM development. Against the backdrop of benchmaxxing in 2025, I'll believe it when I see results on closed benchmarks and SimpleBench. My concern is that this might be a hallucination machine.