Comment by AMeckes
2 days ago
Good points! any-llm handles the LLM routing, but you can still put it behind your own proxy for centralized control. We just don't force that architectural decision on you. Think of it as composable: use any-llm for provider switching, add nginx/envoy/whatever for rate limiting if you need it.
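Roughly, provider switching in code looks like this (a minimal sketch, assuming any-llm's completion() helper and "provider/model" strings as in the project README; model names are illustrative and the exact signatures may differ):

```python
# Minimal sketch: switch providers by changing the model string.
# Assumes any-llm's completion() helper returns an OpenAI-style response object.
from any_llm import completion

messages = [{"role": "user", "content": "Summarize this ticket in one line."}]

# Same call shape, different provider: just swap the model string.
for model in ("openai/gpt-4o-mini", "anthropic/claude-3-5-sonnet-latest"):
    response = completion(model=model, messages=messages)
    print(model, "->", response.choices[0].message.content)
```

Rate limiting, auth, and so on stay in whatever proxy already fronts your app.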
How do I put this behind a proxy? You mean run the module as a containerized service?
But provider switching is built into some of these - and the folks behind Envoy built https://github.com/katanemo/archgw: developers can use an OpenAI client to call any model, it offers preference-aligned intelligent routing to LLMs based on usage scenarios that developers define, and it acts as an edge proxy too.
To clarify: any-llm is just a Python library you import, not a service to run. When I said "put it behind a proxy," I meant your app (which imports any-llm) can run behind a normal proxy setup.
You're right that archgw handles routing at the infrastructure level, which is perfect for centralized control. any-llm simply gives you the option to handle routing in your application code when that makes sense (for example, premium users get Opus-4). We leave the architectural choice to you, whether that's adding a proxy, keeping routing in your app, using both, or just using any-llm directly.
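Concretely, "routing in your application code" is just an ordinary function. A sketch (pick_model() and the model names here are illustrative, not part of any-llm):

```python
# Sketch of in-app routing: premium users get a bigger model.
from any_llm import completion

def pick_model(user_tier: str) -> str:
    # Hypothetical policy: premium traffic goes to Opus, everyone else to a cheaper model.
    return "anthropic/claude-opus-4" if user_tier == "premium" else "openai/gpt-4o-mini"

def answer(user_tier: str, prompt: str) -> str:
    response = completion(
        model=pick_model(user_tier),
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```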
But you can also use tokens to implement routing decisions in a proxy, and make RBAC natively available to all agents outside your code. The trade-off is incremental feature work in code vs. an out-of-process server: one gets you going super fast, the other offers a design choice that (I think) scales a lot better.
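Roughly, the proxy-side version of the same decision (a sketch only: the claim name, the role-to-model map, and the HS256 secret handling are illustrative; a real deployment would enforce this in the proxy itself, e.g. archgw or an Envoy filter, before forwarding to the provider):

```python
# Sketch: the routing decision comes from a token claim, so RBAC lives
# outside application code and applies uniformly to every agent behind the proxy.
import jwt  # PyJWT

ROLE_TO_MODEL = {
    "premium": "anthropic/claude-opus-4",
    "default": "openai/gpt-4o-mini",
}

def model_for_request(bearer_token: str, secret: str) -> str:
    # Decode the caller's token and map its role claim to an upstream model.
    claims = jwt.decode(bearer_token, secret, algorithms=["HS256"])
    return ROLE_TO_MODEL.get(claims.get("role", "default"), ROLE_TO_MODEL["default"])
```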