Comment by gertlabs

9 hours ago

This is true; we have the numbers to back it up at https://gertlabs.com/rankings?mode=oneshot_coding (check out the efficiency chart too).

GPT 5.5/5.4 are the smartest models, but at a steep cost in tokens and code bloat. Qwen 3.6 Max strikes a good balance, and Gemma 4 26B writes some really efficient code, with great results considering the model size. Things do start falling apart at longer context lengths, though.

I have been experimenting with Claude Code using both the Qwen 3.6 31B MoE and 28B dense models. Yesterday the 31B model got confused refactoring some Prolog code and took a very long time to get it right. For coding or refactoring Python or TypeScript it is usually good, but it runs slowly on my 32 GB Mac mini.

Ollama has initial support for Gemma 4 with MTP in bf16 (https://ollama.com/library/gemma4:31b-coding-mtp-bf16), but I will have to wait for a smaller model.

I understand why people get excited about having the strongest AI to play and work with, but the economics of inference really matter too.