Comment by llm_trw

6 months ago

Can someone explain what the point of ollama is?

Every time I look at it, it seems like it's a worse llama.cpp that removes options to make things "easier".

Open-weights LLMs provide a dizzying array of options.

You'd have Llama, Mistral, Gemma, Phi, Yi.

You'd have Llama, Llama 2, Llama 3, Llama 3.2...

And those come in 8B, 13B or 70B parameter sizes.

And you can get it quantised to GGUF, AWQ, exl2...

And quantised to 2, 3, 4, 6 or 8 bits.

And that 4-bit quant is available as Q4_0, Q4_K_S, Q4_K_M...

And on top of that there are a load of fine-tunes that score better on some benchmarks.

Sometimes a model is split into 30 files and you need all 30; other times there are 15 different quants in the same release and you only need a single one. And you have to download from Hugging Face and put the files in the right place yourself.
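To make the selection problem above concrete, here is a tiny sketch of what "pick the one quant you need, or grab every shard of a split model" looks like. All filenames are invented for illustration; real release listings vary.

```python
import re

# Hypothetical release listings (filenames invented for illustration).
single_quant_release = [
    "model-Q2_K.gguf", "model-Q4_0.gguf", "model-Q4_K_S.gguf",
    "model-Q4_K_M.gguf", "model-Q6_K.gguf", "model-Q8_0.gguf",
]
split_release = [f"model-Q4_K_M-{i:05d}-of-00030.gguf" for i in range(1, 31)]

def files_to_download(listing, quant="Q4_K_M"):
    """Pick the single file for the requested quant -- or every part
    if the model is split across numbered shards."""
    wanted = [f for f in listing if quant in f]
    shards = [f for f in wanted if re.search(r"-\d{5}-of-\d{5}\.gguf$", f)]
    return shards if shards else wanted

print(files_to_download(single_quant_release))  # one file
print(len(files_to_download(split_release)))    # all 30 parts
```

This is exactly the kind of bookkeeping ollama hides behind a model tag.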

ollama takes a lot of that complexity and hides it. You run "ollama run llama3.1" and the model selection and download are all taken care of.
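For comparison, the two workflows look roughly like this; model names, paths, and flags are illustrative, and the llama.cpp binary name has changed across versions:

```shell
# With ollama: one command picks a default quant, downloads, and runs.
ollama run llama3.1

# With llama.cpp: build it, fetch a specific quant yourself, then point
# the binary at the file (paths and filenames here are illustrative).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make
# ...download a .gguf quant from Hugging Face by hand, then:
./llama-cli -m ./models/some-model-Q4_K_M.gguf -p "Hello"
```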

Not to be snide, but removing options to make things easier has been wildly successful in a variety of projects/products.

Ollama : llama.cpp :: Dropbox : rsync

  • Not sure this is a good analogy. LM Studio is closer to Dropbox, as both take X and make it easier for users who aren't necessarily very technical. Ollama is a developer-oriented tool (used via a terminal plus a daemon), so I wouldn't compare it to what Dropbox is/did for file syncing.

It’s to make things easier for casual users.

With ollama I type brew install ollama and then ollama pull something, and I have it already running. With llama.cpp it seems I have to build it first, then manually download models somewhere - this is an instant turnoff; I maybe have 5 minutes of my life to waste on this.

Yeah, that's literally the point. You're listing something that you think is a disadvantage, and some people think exactly the opposite.