Comment by ActorNightly
19 hours ago
Qwen is still better than Gemma, though. You can also tune it more for different tasks, which means you can prioritize thinking and accuracy versus inference speed.
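For what it's worth, Qwen3 exposes that thinking-versus-speed tradeoff directly through its chat template. A minimal sketch (the checkpoint name and prompt are placeholders, not anything the commenter specified):

```python
# Sketch: toggling Qwen3's thinking mode via transformers.
# The model name is a placeholder; pick whatever size you actually run.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

messages = [{"role": "user", "content": "Summarize the tradeoffs of MoE models."}]

# enable_thinking=True  -> slower, more deliberate answers
# enable_thinking=False -> faster responses for interactive use
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```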
Qwen is better at some things (code, in particular), but Gemma has better prose and better vision. At least, it feels that way to me.
Gemma is also just way faster. I don't want to wait 10 minutes to get a 5-10% better answer (and sometimes an actually worse one).
The best approach at the moment is to use your own model router, depending on the task.
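As a rough illustration of that idea, here is a minimal router sketch against an OpenAI-compatible local server (e.g. llama.cpp or Ollama). The keyword heuristic and the model tags are assumptions for illustration, not anything the commenter specified:

```python
# Minimal task-based model router sketch.
# Assumes an OpenAI-compatible local server; model tags and the
# keyword heuristic below are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

CODE_HINTS = ("code", "function", "bug", "regex", "compile")

def pick_model(prompt: str) -> str:
    # Route code-ish prompts to Qwen, everything else to Gemma.
    if any(hint in prompt.lower() for hint in CODE_HINTS):
        return "qwen3:30b-a3b"  # placeholder tag
    return "gemma3:27b"         # placeholder tag

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model=pick_model(prompt),
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(ask("Write a Python function that reverses a linked list."))
```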
I'm pretty sure Qwen is faster? The MoE version of Qwen has 3B active parameters, while Gemma 4 has 4B active. Similarly, the dense Qwen is 27B while Gemma is 31B. All else being equal (though I know all else isn't equal), Qwen should be faster in both cases. I haven't actually measured with any precision, but on my AMD hardware (Strix Halo or dual Radeon Pro V620) they seem quite similar: both MoE models are fast enough for interactive use, and both dense models are notably smarter but much slower, with a long time to first response and single-digit tokens per second once they start talking.
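If you want more than a seat-of-the-pants comparison, a rough tokens-per-second measurement is only a few lines. A sketch assuming an OpenAI-compatible local server; the endpoint and model tags are placeholders:

```python
# Rough tokens/sec benchmark sketch against an OpenAI-compatible
# local server; endpoint and model tags are placeholders.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

def tokens_per_second(model: str, prompt: str, max_tokens: int = 256) -> float:
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    elapsed = time.perf_counter() - start
    # completion_tokens counts everything generated after the prompt.
    return resp.usage.completion_tokens / elapsed

for tag in ("qwen3:30b-a3b", "gemma3:27b"):  # placeholder tags
    tps = tokens_per_second(tag, "Explain mixture-of-experts in one paragraph.")
    print(f"{tag}: {tps:.1f} tok/s (elapsed time includes time to first token)")
```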
Genuine question: how do you tune it?
I thought "fine-tuning" meant training it on additional data to add additional facts / knowledge? I might be mistaking your use of the word "tune", though :)
You can fine-tune relatively easily in Unsloth Studio.
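To make that concrete, a hedged sketch of a LoRA fine-tune using Unsloth's Python API (not necessarily what Unsloth Studio itself looks like; the checkpoint, dataset path, and hyperparameters are placeholders):

```python
# Sketch of a LoRA fine-tune with Unsloth + TRL.
# Model name, dataset, and hyperparameters are placeholders.
from unsloth import FastLanguageModel
from trl import SFTTrainer, SFTConfig
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B",  # placeholder checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # LoRA rank: bigger = more capacity, more VRAM
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Placeholder dataset: a JSONL file with a "text" column.
dataset = load_dataset("json", data_files="my_task.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        max_steps=100,
        output_dir="qwen3-lora",
    ),
)
trainer.train()
```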
It’s a heck of a lot faster too.
Yes, I would just go with Qwen.