Comment by satvikpendem

16 hours ago

Qwen 3.6 27B dense is much better than the 35B MoE model for coding, not sure if you've tried that yet.

6 comments

sheeshkebab 3 hours ago

27B is slow as molasses vs. 35B on the local hardware I have (M5 Max). MTP doesn’t make any difference either.

walrus01 16 hours ago

Yes, I have; I use both. The 27B is slower in tok/s because it's dense, obviously, so I use the 35B-A3B for speed on simpler tasks.
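
For anyone wondering why density matters so much, here is a rough back-of-envelope sketch. It assumes decoding is purely memory-bandwidth-bound, so each generated token requires reading every active weight once. The bandwidth figure and the 4-bit quantization are made-up illustrative values, not measurements from any machine in this thread, and the function name is hypothetical.

```python
def decode_tokens_per_second(active_params_billions, bytes_per_param, bandwidth_gb_per_s):
    """Rough upper bound on decode speed when memory bandwidth is the bottleneck.

    Each generated token needs one pass over the active weights, so the
    ceiling is bandwidth divided by the bytes read per token.
    """
    gb_read_per_token = active_params_billions * bytes_per_param
    return bandwidth_gb_per_s / gb_read_per_token


# Assumed, illustrative values (not measured on any hardware in this thread).
BANDWIDTH_GB_S = 256.0   # hypothetical memory bandwidth in GB/s
BYTES_PER_PARAM = 0.5    # assumes roughly 4-bit quantized weights

# Dense 27B: all 27B parameters are active for every token.
dense_27b = decode_tokens_per_second(27.0, BYTES_PER_PARAM, BANDWIDTH_GB_S)

# 35B-A3B MoE: only about 3B parameters are active per token.
moe_a3b = decode_tokens_per_second(3.0, BYTES_PER_PARAM, BANDWIDTH_GB_S)

print(f"27B dense:   ~{dense_27b:.0f} tok/s ceiling")
print(f"35B-A3B MoE: ~{moe_a3b:.0f} tok/s ceiling")
print(f"ratio:       {moe_a3b / dense_27b:.1f}x")
```

Real numbers land below these ceilings (attention, KV cache reads, expert routing, and compute all cost something), but the roughly 9x gap in active weights is the main reason the MoE feels so much faster.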

intothemild 9 hours ago
You should enable MTP now that it's available.
llama.cpp has had some massive updates in the last week or so.
- npodbielski 5 hours ago
  
  Yes, Qwen 3.6 MoE is hitting around 80-90 tok/s on Strix Halo. On an R9700 I got around 170 tok/s. It was not possible to keep up with it. But the MoE goes in circles very often. I then switch to the dense model and get 20-30 tok/s, but it is able to solve quite a lot of tasks.