Comment by 0xbadcafebee
2 months ago
I still use llama-swap because its configuration lets me tailor llama.cpp settings per model and set autoload timeouts, and it gives me log viewing in the web UI, lots of great metrics, and one-click model load/unload. Llama-swap also technically lets you port-forward to some other app or service, such as a remote one.