Comment by adchurch
4 days ago
I think the key detail here is that we use embeddings of the prompt + previous context in order to decide where to route the request, and if one model is getting stuck we can course correct and move to a different model.
So: we can reasonably cluster similar problems together and learn how models handle them, and the entire system doesn't fail if the initial decision is off.
No comments yet
Contribute on Hacker News ↗