
Comment by jjice

1 day ago

Ollama is a more out-of-the-box solution. I also prefer llama.cpp for the more FOSS aspects, but Ollama is a simpler install, model download (this is the biggest convenience IMO), and execution. That's why I believe it's still fairly popular as a solution.

By the way, you can download models straight from Hugging Face with llama.cpp. The command might be a few characters longer than the one you'd run with Ollama, but still.
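For example, a rough sketch (the repo and model names here are just placeholders, and flag spellings can differ between llama.cpp releases):

    # llama.cpp: pull a GGUF straight from Hugging Face and start chatting
    llama-cli -hf bartowski/Llama-3.2-1B-Instruct-GGUF

    # roughly the Ollama equivalent
    ollama run llama3.2:1b

Both tools cache the download, so subsequent runs reuse the local copy.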

  • Then you also need to provide appropriate metadata and format messages correctly according to the model's chat template, which I believe llama.cpp doesn't do by default, or can it? I had trouble formatting messages correctly with llama.cpp, possibly due to a mismatch in metadata, which Ollama seems to handle, but I'd love to know if this is wrong (there's a sketch of llama-server's behavior below, after these replies).

    • Plus a Hugging Face token to access models that require you to beg for approval. Ollama-hosted models don't require that (which may not be legit, but most users don't care).

  • You can, but you have to know where to look, and you have to have some idea of what you're doing. The benefit of Ollama is that the barrier to entry is really low, as long as you have the right hardware.

    To me, one of the benefits of running a model locally is learning how all this stuff works, so Ollama never had any appeal. But most people just want stuff to work without putting in the effort to understand how it all fits together. Ollama meets that demand.
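On the chat-template question above: llama-server exposes an OpenAI-style chat endpoint that formats messages with the model's chat template server-side, reading it from the GGUF metadata (the --jinja flag switches to the full embedded Jinja template). A minimal sketch, assuming a local server on the default port, with the model path and prompt as placeholders:

    # start the server; --jinja applies the chat template embedded in the GGUF
    llama-server -m ./model.gguf --jinja

    # send plain role/content messages; the server handles the prompt formatting
    curl http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"messages": [{"role": "user", "content": "Hello!"}]}'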

I disagree that Ollama is easier to install. I tried to enable Vulkan on Ollama and it was nightmarish, even though the underlying llama.cpp code supports it with a simple env var. Ollama was easy 2 years ago, but it has been getting progressively worse.
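For reference, a sketch of enabling the Vulkan backend in upstream llama.cpp (assuming a recent checkout with CMake and the Vulkan SDK installed; the option name has changed across versions):

    cmake -B build -DGGML_VULKAN=ON
    cmake --build build --config Release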