Comment by dwood_dev
2 days ago
It seems to me like it would be a rather useful exercise to have the smaller model make the routing decision, and below certain confidence thresholds, it sends it to a larger model anyways. Then have the larger model evaluate that choice and perhaps refine instructions.
That's a cleaner implementation than what I described. Small model as meta-router: classify locally, escalate only when confidence is low. The self-evaluation loop you're suggesting would add a quality layer without much overhead — the large model's judgment of its own routing is itself a useful signal. Haven't shipped that yet but it's on the list.