Comment by nodesocket

3 days ago

This is awesome, will give it a try tonight.

I’ve been looking for something a bit different, though still related to Ollama: a load-balancing reverse proxy that queues requests across multiple Ollama servers and only forwards a request when an Ollama server is up and idle (not processing another request). Does anything like that exist?
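
To make the idea concrete, here is a rough sketch of the dispatch behavior I mean. It is not an existing tool and not anything Ollama ships. `IdleBalancer`, `fake_send`, and the backend URLs are hypothetical names made up for illustration, and the health check ("up") and the actual HTTP forwarding are left out: `send` stands in for whatever would really call the server.

```python
import asyncio


class IdleBalancer:
    """Hypothetical sketch: queue requests and give each one to an idle backend."""

    def __init__(self, backends, send):
        # `send` is an assumed coroutine: send(backend, request) -> response.
        self._send = send
        # Backends that are currently idle. A backend is removed while it is
        # processing a request and put back when it finishes.
        self._idle = asyncio.Queue()
        for backend in backends:
            self._idle.put_nowait(backend)

    async def submit(self, request):
        # If every backend is busy, this await is where the request waits.
        backend = await self._idle.get()
        try:
            return await self._send(backend, request)
        finally:
            # Mark the backend idle again, even if the request failed.
            self._idle.put_nowait(backend)


async def fake_send(backend, request):
    # Stand-in for a real HTTP call to an Ollama server (assumption).
    await asyncio.sleep(0.01)
    return f"{backend} handled {request}"


async def demo():
    # Create the balancer inside the running event loop.
    lb = IdleBalancer(["http://ollama-a:11434", "http://ollama-b:11434"], fake_send)
    # Four requests, two backends: at most two run at once, the rest wait.
    return await asyncio.gather(*(lb.submit(f"req{i}") for i in range(4)))


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

With this shape, each server only ever has one request in flight, and extra requests simply wait until a server frees up. A real proxy would also need health checks, timeouts, and streaming responses, none of which are covered here.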