Comment by lostmsu
15 hours ago
AFAIK gpt-oss-20b on high reasoning has a SWE-bench score of just over 60, and it is smaller than all comparable models. Maybe I am missing something, but it is still state of the art all the way up to ~50B parameters, even against models released after it.
At least the https://huggingface.co/facebook/cwm team had the balls to compare against it directly (sort of, see TTS).
What does this model do that gpt-oss-20b does not? AFAIU the base model it was finetuned from is not reproducible, and if I flipped a single bit in gpt-oss-20b and told you how (with the instructions released under MIT), that would satisfy the "fully open finetuning" they claim as an advantage. But that "open" fine-tuned gpt-oss-20b is probably still going to beat their model.
Am I missing something?