Comment by K0balt

1 month ago

What is your opinion on qwen 35b MoE vs qwen 27b dense?

2 comments

K0balt

Maybe a skill issue, but they both feel about the same and the MoE is 3x faster, so I barely use the dense model.

latable 1 month ago

Not the person asked, but on a medium-sized bug spanning a few Python files, I found the MoE too eager to try things without first trying to understand the issue, whereas the dense model thought hard and added debug statements to work out how to fix it. But the dense model is quite slow (Q4KM quant, MI50 32GB, llama.cpp, pi).