
Comment by kevingadd

5 hours ago

Almost every mention I've seen of gpt-oss has been a complaint that training on synthetic datasets produced a model that's mostly good at benchmarks. Are the benchmarks the great results you're referring to, or are there a lot of satisfied users out there who just don't post here on HN? Genuinely curious.

I can see how performing well on benchmarks at the expense of everything else counts as a great result if that's the point of the model.