Comment by margalabargala
8 hours ago
Wonder where it falls on the Sonnet 3.7/4.0/4.5 continuum.
3.7 was not all that great. 4 was decent for specific things, especially self-contained stuff like tests, but couldn't do a good job with more complex work. 4.5 is now excellent at many things.
If it's around the perf of 3.7, that's interesting but not amazing. If it's around 4, that's useful.
I have yet to find a "small" model that can use function calls consistently enough to not be frustrating. That is the most noticeable difference I see between even older "SOTA" models and the best-performing "small" models (<70B).