Comment by prodigycorp

2 months ago

Yeah. 3.7 was pretty bad. I remember its warts vividly. It wanted to refactor everything. Not a great model on which to hinge this provocation.

But skills do improve model performance. OpenAI posted some examples of how skills massively juiced up their results on some benchmarks.