Comment by zombot

2 months ago

I don't care about the GUI so much. Ollama lets me download, adjust and run a whole bunch of models and they are reasonably fast. Last time I compared it with llama.cpp, figuring out how to download and install models in llama.cpp was a pain, and it was also _much_ slower than Ollama.

That is not true.

If you visit a model's page on Hugging Face today, the site will show you the exact one-liner you need to run it on llama.cpp.
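For illustration, here is a rough sketch of what that one-liner tends to look like, wrapped in a small Python helper that only builds the command. It assumes a llama.cpp build that provides `llama-server` with the `-hf` option for pulling a GGUF repo from Hugging Face. The helper name `llama_cpp_command` is made up for this example, and the repo name is just a placeholder; use whatever the model page actually shows.

```python
import shlex
from typing import List, Optional


def llama_cpp_command(repo: str,
                      quant: Optional[str] = None,
                      binary: str = "llama-server") -> List[str]:
    """Build the argv for running a Hugging Face GGUF repo with llama.cpp.

    Assumption: the installed llama.cpp binary accepts `-hf <user>/<repo>`
    with an optional `:<quant>` suffix. This function is a hypothetical
    helper for illustration, not part of llama.cpp.
    """
    target = f"{repo}:{quant}" if quant else repo
    return [binary, "-hf", target]


if __name__ == "__main__":
    # Placeholder repo name; copy the real one from the model page.
    argv = llama_cpp_command("ggml-org/gemma-3-1b-it-GGUF")
    # Print the command so it can be pasted into a shell.
    print(shlex.join(argv))
```

Printing the command instead of executing it keeps the sketch harmless on machines where llama.cpp is not installed; passing the list to `subprocess.run` would start the actual download and server.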

I didn't measure it, but both download and inference felt faster than with Ollama. One thing that was definitely better was memory usage, which may matter if you want to run small models on an SBC.

I picked it up recently and compared it to both llama and LM Studio. The models I was using ran faster, used less memory, and had a few extra config options available that the others hadn't implemented yet but that the model authors suggested.

It was easy to install, run, and access the GUI to get going.