Comment by ekianjo
10 hours ago
I guess the parallel is `ollama serve`, which provides you with a direct REST API to interact with an LLM.
llama.cpp provides an API server as well via llama-server (and a competent web UI too).
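For context, here is roughly what hitting each server looks like from the command line, assuming both are running on their default ports (11434 for Ollama, 8080 for llama-server) and that a model named `llama3` is available; adjust to taste:

```shell
# Ollama's native REST API (default port 11434)
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'

# llama-server exposes an OpenAI-compatible endpoint (default port 8080)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Why is the sky blue?"}]}'
```

The OpenAI-compatible shape of llama-server's endpoint means most existing client libraries can point at it by just swapping the base URL.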