Comment by gnulinux
6 months ago
At $2/1Mt it's cheaper than, e.g., Gemini 2.5 Pro ($1.25/1Mt input and $10/1Mt output). When I code with Aider, my requests average something like 5000 input tokens and 800 output tokens. At those rates, Gemini 2.5 Pro works out to about $0.01425 per Aider request and Cerebras Qwen3 Coder to $0.0116 per request. Not a huge difference, but I think it's sufficiently cheaper to be competitive, especially given that Qwen3-Coder is on par with Gemini/Claude/o3 and even surpasses them in some tests.
NOTE: Currently on OpenRouter, Qwen3-Coder requests are averaging about $0.3/1M input tokens and $1.2/1M output tokens. That's so significantly cheaper that I wouldn't be surprised if open-weight models start eating Google/Anthropic/OpenAI's lunch. https://openrouter.ai/qwen/qwen3-coder
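For anyone who wants to check my numbers, here's the arithmetic as a rough Python sketch. The 5000/800 token counts are just my own Aider averages, and the prices are the ones quoted above, which may well have changed by the time you read this:

    # Per-request cost in dollars at given per-million-token prices.
    def request_cost(input_tokens, output_tokens, price_in_per_1m, price_out_per_1m):
        return (input_tokens * price_in_per_1m + output_tokens * price_out_per_1m) / 1_000_000

    # Gemini 2.5 Pro: $1.25/1Mt input, $10/1Mt output
    print(request_cost(5000, 800, 1.25, 10.0))  # 0.01425

    # Cerebras Qwen3 Coder: flat $2/1Mt for both input and output
    print(request_cost(5000, 800, 2.0, 2.0))    # 0.0116

    # OpenRouter Qwen3-Coder averages: ~$0.3/1Mt input, ~$1.2/1Mt output
    print(request_cost(5000, 800, 0.3, 1.2))    # 0.00246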
Do you have any experience with how good Qwen3-Coder is compared to Claude 4 Sonnet?
No, unfortunately, I haven't used Qwen3-Coder yet. I do like Claude 4 Sonnet, but my favorite programming LLM at the moment is Gemini 2.5 Pro; I think it's the smartest model (though Claude and o3 do produce cleaner code).
I do have experience with the base Qwen3-32B model, and it's extremely good for its size, especially at solving undergrad/grad-level math problems. So my guess would be that Qwen3-Coder should be competitive, but that's just speculation.