Comment by cmrdporcupine

9 hours ago

A small model can be made to be "comparable to Opus" in some narrow domains, and that's what they've done here.

But when actually employed to write code, these models will fall over as soon as they leave that specific domain.

Basically, they might have skill but lack wisdom. Certainly at this size they will lack anything close to the same contextual knowledge.

Still, these things could be useful in the context of more specialized tooling, or in a harness that heavily prompts in the right direction, or as a subagent for a "wiser" larger model that does all the planning and reviews the results.