Show HN: LLMRouter – Stop using GPT-4/o1 for everything (16 routing strategies)

4 hours ago (github.com)

OP here. I'm a CS PhD student at UIUC working on User Modeling and Applied ML.

We built LLMRouter because we noticed a gap in the current LLM stack: everyone knows we shouldn't route every query to GPT-4/o1 (it's slow and expensive), but building a reliable router that handles context, reasoning, and user history is surprisingly hard.

Most existing solutions are either simple regex/keyword matching or closed-source APIs. We wanted to build a standard, open-source library that unifies the state-of-the-art routing methods.

What LLMRouter actually does: It provides a unified interface to 16+ routing strategies, ranging from lightweight ML to heavy reasoning agents:

Single-Round: Classification-based (KNN, SVM, BERT) and Embedding-based methods.

Multi-Round & Agentic: Routers that "think" before assigning models (CoT reasoning) or break down tasks step-by-step.

Personalized Routing: This is a key focus of our research. The router learns from user interaction history to fit individual preferences (e.g., some users prefer concise answers from faster models, others need detailed reasoning).
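To make the single-round idea concrete, here is a minimal sketch of what a KNN-style router does conceptually. This is my own toy illustration, not LLMRouter's actual API: embed the query, find the most similar labeled training queries, and route to the model that the majority of neighbors were assigned to.

```python
# Hypothetical sketch of single-round, KNN-based routing (names and data
# are made up for illustration; this is not the library's real interface).
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn_route(query_vec, labeled_vecs, k=3):
    """labeled_vecs: list of (embedding, model_name) pairs from routing data.
    Returns the majority model label among the k nearest neighbors."""
    neighbors = sorted(labeled_vecs, key=lambda p: cosine(query_vec, p[0]),
                       reverse=True)[:k]
    votes = Counter(model for _, model in neighbors)
    return votes.most_common(1)[0][0]

# Toy 2-D "embeddings": easy queries cluster near (1, 0), hard near (0, 1).
train = [
    ([0.9, 0.1], "small-fast-model"),
    ([1.0, 0.0], "small-fast-model"),
    ([0.8, 0.2], "small-fast-model"),
    ([0.1, 0.9], "large-reasoning-model"),
    ([0.0, 1.0], "large-reasoning-model"),
    ([0.2, 0.8], "large-reasoning-model"),
]

print(knn_route([0.95, 0.05], train))  # an "easy" query -> small-fast-model
print(knn_route([0.05, 0.95], train))  # a "hard" query -> large-reasoning-model
```

In practice the embeddings would come from a sentence encoder and the labels from routing data, but the decision rule is this simple at its core.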

The Pipeline: We didn't just ship the model weights. The library includes:

Data Generation: A pipeline to generate synthetic routing data for your specific domain.

Benchmarks: 11 datasets to evaluate router performance.

Deployment: A CLI and Gradio UI to visualize routing decisions in real-time.
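For the data-generation step, one common recipe (sketched below with hypothetical function names — this is not the library's actual pipeline) is: run each query past every candidate model, score the answers with a judge, and label the query with the cheapest model whose answer clears a quality threshold.

```python
# Hypothetical sketch of synthetic routing-data generation. All names
# (answer_fn, score_fn, the toy models) are illustrative assumptions.

def generate_routing_labels(queries, models, answer_fn, score_fn, threshold=0.8):
    """models: list of (name, cost_per_call) pairs.
    answer_fn(model, query) -> answer; score_fn(query, answer) -> [0, 1].
    Labels each query with the cheapest model that meets the threshold."""
    dataset = []
    for q in queries:
        label = max(models, key=lambda m: m[1])[0]  # fall back to strongest
        for name, _cost in sorted(models, key=lambda m: m[1]):
            if score_fn(q, answer_fn(name, q)) >= threshold:
                label = name
                break
        dataset.append({"query": q, "model": label})
    return dataset

# Toy stand-ins for real model calls and an LLM judge:
def fake_answer(model, query):
    return f"{model}:{query}"

def fake_score(query, answer):
    # Pretend the small model only handles short queries well.
    if answer.startswith("small") and len(query) > 20:
        return 0.3
    return 0.9

models = [("small", 1.0), ("large", 10.0)]
data = generate_routing_labels(
    ["2+2?", "Prove the fundamental theorem of algebra."],
    models, fake_answer, fake_score,
)
print(data)  # short query labeled "small", long query labeled "large"
```

The resulting (query, model) pairs become training data for the routers above.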

In our experiments, we typically see a 30–50% cost reduction while maintaining response quality, because the router correctly separates easy queries (sent to cheaper models) from hard ones (sent to the expensive model).
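The back-of-envelope math behind that kind of saving is straightforward. The prices and traffic split below are assumed numbers for illustration, not figures from our experiments:

```python
# Illustrative cost arithmetic with assumed numbers (not measured data).
big_cost, small_cost = 8.0, 0.5   # $ per 1M tokens, hypothetical prices
easy_fraction = 0.5               # assumed share of queries routed to the small model

baseline = big_cost               # send everything to the big model
routed = easy_fraction * small_cost + (1 - easy_fraction) * big_cost
savings = 1 - routed / baseline
print(f"{savings:.0%}")           # prints "47%" under these assumptions
```

The actual saving depends on your price gap and how much of your traffic the router judges to be easy.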

The code is open source (MIT/Apache): https://github.com/ulab-uiuc/LLMRouter

Happy to answer any questions about the implementation details or the specific RL/Ranking algorithms we used!