Comment by originalvichy

8 hours ago

For at least a year now, it has been clear that data quality and fine-tuning are the main sources of improvement for medium-level models. Size != quality for specialized, narrow use cases such as coding.

It’s not a surprise that models are leapfrogging each other when the engineers are able to incorporate better code examples and reasoning traces, which in turn bring higher quality outputs.

If all you're looking at is benchmarks, that might be true, but those are way too easy to game. Try using this model alongside Opus for some work in Rust/C++ and it'll be night and day. You really can't compare a model that's got trillions of parameters to a 27B one.

  • > ...and it'll be night and day.

    That's just, like, your opinion, man.

    > You really can't compare a model that's got trillions of parameters to a 27B one.

    Parameter count doesn't matter much when coding. You don't need in-depth general knowledge or multilingual support in a coding model.

    • I often do need in-depth general knowledge in my coding model, so that I don't have to explain domain-specific logic to it every time and so that it can have some sense of good UX.