Comment by SkyPuncher
6 months ago
Ollama - `brew install ollama`
llama.cpp - Read the docs, which are loaded with information but unclear about use cases. Question whether it has the API compatibility and secondary features that a bunch of tools expect. Decide it's not worth the effort when `ollama` is already running by the time you've finished reading.
llama.cpp is also a single brew command, though: https://formulae.brew.sh/formula/llama.cpp
Additionally, Ollama makes model installation a single command. With llama.cpp, you have to download the raw models from Hugging Face and manage their storage yourself.
Not really. llama.cpp has been able to download models itself for quite some time. It's not as elegant as Ollama, but it will get you up and running in a single command.
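For example, `llama-server` can fetch a GGUF straight from Hugging Face with its `-hf` flag, something like (the repo name here is illustrative):

`llama-server -hf unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF`

which downloads the model on first run, caches it, and starts an OpenAI-compatible HTTP server (default port 8080).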
And now you need a server per model? Ollama loads models on demand and unloads them after an idle timeout, all accessible over the same HTTP API.
`ollama run deepseek-r1:14b`
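And that single-API point is easy to see: once the daemon is up, any installed model is reachable through the same endpoint, e.g. (a sketch, assuming Ollama's default port 11434):

`curl http://localhost:11434/api/generate -d '{"model": "deepseek-r1:14b", "prompt": "Why is the sky blue?"}'`

Ollama loads `deepseek-r1:14b` on demand to serve the request; change the `model` field and the same endpoint serves a different model, no second server needed.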