Comment by ojosilva

2 days ago

Yeah, I'm just going through the Cerebras migration at the moment.

It's a shame Cerebras completely dropped Qwen3 Coder's fast tool calling, short and instant responses, and better speed overall in favor of GLM 4.6 thinking. Qwen3 is really good at hitting the tools first, then coming up with a well-grounded answer based on reality. Sometimes it's good when a model is Socratic: it just knows that it knows nothing.

GLM 4.6, on the other hand, is more self-sufficient: if it sees it and knows it, it thinks and thinks and finally just fixes it in one or two shots, so when you hit the jackpot, it's probably an improvement over Q3C. But when it does not get it right, it digs itself into a hole larger than Olympus Mons.

2 comments


KronisLV 1 day ago

> Qwen3 is really good at hitting the tools first, then coming up with a well-grounded answer based on reality.

I don't know, I had a lot of issues with Qwen models when it comes to RooCode/Cline - failed edits (albeit with a requirement for 100% precision, since I don't want the wrong lines to be replaced) or calling tools without parameters (e.g. list_files without path) and also stuff like using wrong path separators or using the wrong commands for the shell that's available (e.g. cmd when Git Bash is the shell).

GLM 4.6 seems better in that regard so far, maybe the coming weeks and months will show that better.

ojosilva 14 hours ago

I've used Qwen3 Coder with CC and the match was great, not a lot of issues. I believe Qwen had a clear focus on distilling Anthropic models. GLM 4.6 is maybe slightly better, but the speed dropped to half on Cerebras, so that's the price for a maybe ~15% improvement in overall model quality. This quality gain does not necessarily mean the end product (the code) is 15% better, just that I now take 12 turns with GLM instead of 15 turns with Qwen to get something done. But since turn speed has been halved on Cerebras, my TTC (time-to-completion) has actually gone from 15 min to 24 min!
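
To spell out that arithmetic, here is a back-of-the-envelope sketch. It assumes every turn takes the same time (about 1 min per turn for Qwen, inferred from 15 turns in 15 min, and 2 min per turn for GLM at half speed); the function name is just for illustration.

```python
def time_to_completion(turns: int, minutes_per_turn: float) -> float:
    """Total wall-clock minutes, assuming every turn takes the same time."""
    return turns * minutes_per_turn


# Assumed per-turn times, inferred from the figures quoted above.
QWEN_MIN_PER_TURN = 1.0  # 15 turns in 15 min
GLM_MIN_PER_TURN = 2.0   # same work at half the speed

qwen_ttc = time_to_completion(turns=15, minutes_per_turn=QWEN_MIN_PER_TURN)
glm_ttc = time_to_completion(turns=12, minutes_per_turn=GLM_MIN_PER_TURN)

# Turn count at which GLM would only break even with Qwen on TTC.
break_even_turns = qwen_ttc / GLM_MIN_PER_TURN

print(f"Qwen3 Coder TTC: {qwen_ttc:.0f} min")
print(f"GLM 4.6 TTC: {glm_ttc:.0f} min")
print(f"GLM breaks even at {break_even_turns:.1f} turns")
```

Under these assumptions, GLM would need to finish in about 7.5 turns instead of 12 just to match Qwen's time, so cutting 20% of the turns does not come close to paying for a 2x slowdown per turn.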