Comment by girvo
21 hours ago
Gemma is better than Qwen at everything except coding, in all my evaluations. Which is a shame because that is what I use them for!
I have an M1 MacBook Pro with only 16GB, and I struggled with Qwen2.5-14b when trying to do large projects. I loved Qwen, but I had to try something different, so I switched to Gemma4-12b, which, looking at it now, seems more like a downgrade than an upgrade. Can you refer me to any Qwen coding models that won't choke my poor 16GB and also connect contextually? I love the laser-point focus, but I need context and a basic understanding of that context.
I haven't run a proper eval, but I've been getting better luck with Qwen models than Gemma on plant and animal identification using vision.
I do like Gemma for translation, however.
It would be great if the Gemma folks would release a code-focused model. Probably won't happen, but it's fun to dream.
The Ornith folks say they're doing that, but haven't released the Gemma-based 31b yet (https://github.com/deepreinforce-ai/Ornith-1). But, also, the Qwen-based 35b MoE Ornith version performs worse than Qwen 3.6 and Qwen AgentWorld on my benchmarks (which are focused on finding security bugs, so not exactly the same as agentic coding, but closely related skills).
That said, the reason they're able to release Ornith branded post-trains of both Gemma and Qwen is because they're open weights under a friendly license. Someone, not just Google, could make a coding focused Gemma post-train. I don't think it's actually much weaker than Qwen 3.6 for coding; Gemma 4 31b outperforms Qwen 3.6 27b by a wide margin on security bug hunting (at least for the specific bugs in my benchmarks, which are mostly relatively difficult bugs from the Mythos-reported bugs).
I'd really love to see a bigger MoE from Google, though. A 70b or 120b MoE would likely be super fun.
Ya, doesn't seem to be Google's focus at all, right?
Gemma is also worse for tool calling, not just coding.
That is because they use a different tool calling format than most other models. Unsloth quants fix this in their Gemma releases.
I've never been able to fix the tool calling issues. Running the Unsloth versions with llama.cpp, I get constant issues. I have tried many forum fixes, including lots of fixed chat templates, to no avail. It's mostly the edit call that breaks, which often results in "let me just rewrite the whole file from context".
Can you say a bit more about this? The bad tool calling has made me give up on using Gemma for my Hermes and a personal recipe site. I have only downloaded from Ollama.