
Comment by satvikpendem

16 hours ago

Qwen 3.6 27B dense is much better than the 35B MoE model for coding, not sure if you've tried that yet.

27B is slow as molasses vs. 35B on the local hardware I have (M5 Max). MTP doesn’t make any difference either.

Yes, I have; I use both. The 27B is slower in tok/s because it's dense, obviously (all 27B parameters are active per token, versus about 3B for the MoE), so I use the 35B-A3B for speed on simpler tasks.

  • You should enable MTP now that it's available.

    llama.cpp has had some massive updates in the last week or so.

    • Yes, Qwen 3.6 MoE hits around 80-90 tok/s on Strix Halo. On an R9700 I got around 170 tok/s, faster than I could keep up with. But the MoE goes in circles very often. I then switch to the dense model and get 20-30 tok/s, but it is able to solve quite a lot of tasks.
