Comment by hobofan
19 hours ago
That's almost the simplest kind of router imaginable, isn't it? Just embed the query and route to the model that has performed best on similar queries in the past?
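To make that concrete, here is a minimal sketch of such a router as a k-nearest-neighbor lookup. It is not taken from the work being discussed. The 2-d "embeddings", the model names `small` and `large`, the scores, and the `route` helper are all made up for illustration; a real setup would use an actual embedding model and probably trade score against cost.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def route(query_vec, history, k=2):
    """Pick the model with the highest total past score on the k nearest past queries.

    history: list of (embedding, {model_name: score}) records (hypothetical format).
    """
    nearest = sorted(history, key=lambda rec: cosine(query_vec, rec[0]), reverse=True)[:k]
    totals = defaultdict(float)
    for _, scores in nearest:
        for model, score in scores.items():
            totals[model] += score
    return max(totals, key=totals.get)

# Made-up history: one cluster where "small" did well, one where only "large" did.
history = [
    ([1.0, 0.0], {"small": 0.9, "large": 0.8}),
    ([0.9, 0.1], {"small": 0.8, "large": 0.7}),
    ([0.0, 1.0], {"small": 0.2, "large": 0.9}),
    ([0.1, 0.9], {"small": 0.3, "large": 0.95}),
]

print(route([0.95, 0.05], history))  # lands in the first cluster
print(route([0.05, 0.95], history))  # lands in the second cluster
```

Note that `route` only ever looks at where the query sits in embedding space. Any query that lands near the first cluster goes to `small`, regardless of how hard it actually is.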
I'm sure that has been documented/tried before, and this almost certainly doesn't work in practice. The typical counter-example would be a simple-sounding query that actually requires complex reasoning: because the query is close in embedding space to other simple-sounding queries, it would be sent to a "dumber model" for efficiency.
I guess that works out in their benchmarks because, from what it sounds like, they do per-dataset clustering, so the embedding clusters may actually end up separating "complexity levels". However, if you were to mix all datasets into one (similar to what you would encounter in most real-world use cases) and cluster against that, this approach would surely break down.