Comment by easygenes 2 days ago

Other benchmark aggregates are less favorable to GPT-OSS-120B: https://arxiv.org/abs/2508.12461
With all these things, it depends on your own eval suite. gpt-oss-120b works as well as o4-mini over my evals, which means I can run it via OpenRouter on Cerebras where it's SO DAMN FAST and like 1/5th the price of o4-mini.
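The setup described above — calling gpt-oss-120b through OpenRouter with the request pinned to the Cerebras provider — can be sketched against OpenRouter's OpenAI-compatible chat-completions endpoint. The model slug, provider name, and provider-routing fields below are assumptions based on OpenRouter's public documentation, not something the commenter specified:

```python
import json
import urllib.request

# OpenRouter exposes an OpenAI-compatible chat-completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_payload(prompt: str) -> dict:
    """Build a chat-completions request routed to Cerebras.

    Assumptions: "openai/gpt-oss-120b" is the model slug, and OpenRouter's
    provider-routing object accepts "order" / "allow_fallbacks" as documented.
    """
    return {
        "model": "openai/gpt-oss-120b",
        "messages": [{"role": "user", "content": prompt}],
        # Prefer Cerebras and do not fall back to slower providers.
        "provider": {"order": ["Cerebras"], "allow_fallbacks": False},
    }


def ask(prompt: str, api_key: str) -> str:
    """Send the request and return the assistant's reply text."""
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Pinning the provider matters here because OpenRouter may otherwise route the same model slug to a cheaper or slower backend; disabling fallbacks trades availability for the speed the commenter is after.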
How would you compare gpt-oss-120b to (for coding):
Qwen3-Coder-480B-A35B-Instruct
GLM4.5 Air
Kimi K2
DeepSeek V3 0324 / R1 0528
GPT-5 Mini
Thanks for any feedback!
I’m afraid I don’t use any of those for coding