
Comment by SkyPuncher

6 months ago

Ollama - `brew install ollama`

llama.cpp - Read the docs, which have loads of information but unclear use cases. Wonder whether it has the API compatibility and secondary features that a bunch of tools expect. Decide it's not worth your effort when `ollama` is already running by the time you've finished reading the docs.

Additionally, Ollama makes model installation a single command. With llama.cpp, you have to download the raw models from Huggingface and handle storage for them yourself.
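For illustration, a typical flow looks roughly like this (the model tag is just an example; any tag from the Ollama library works the same way):

    # pull a model and start chatting with it in one step
    ollama run llama3.2

    # or fetch it ahead of time and check what's stored locally
    ollama pull llama3.2
    ollama list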

  • Not really, llama.cpp has been able to download models itself for quite some time now. It's not as elegant as ollama, but:

        llama-server --model-url "https://huggingface.co/bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-32B-IQ4_XS.gguf"
    

    will get you up and running in a single command.
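
    Once it's running, the server speaks an OpenAI-compatible HTTP API, so (assuming the default port 8080) something like this should work:

        curl http://localhost:8080/v1/chat/completions \
            -H "Content-Type: application/json" \
            -d '{"messages": [{"role": "user", "content": "Hello"}]}'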

    • And now you need a server per model? Ollama loads models on demand and unloads them after an idle timeout, all accessible over the same HTTP API.
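
      Roughly, against Ollama's default endpoint (port 11434; the model names here are just examples), switching models is just a field in the request:

          # same server, different models; each is loaded on demand
          curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Hello"}'
          curl http://localhost:11434/api/generate -d '{"model": "qwen2.5", "prompt": "Hello"}'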