Comment by vunderba

3 days ago

FWIW, Ollama already does most of this:

- Cross-platform

- Sets up a local API server (quick example below)

The tradeoff is a somewhat higher learning curve, since you need to manually browse the model library and choose the model/quantization that best fits your workflow and hardware. OTOH, it's also open source, unlike LM Studio, which is proprietary.
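For example, once Ollama is running, its local server answers plain HTTP on the default port 11434. A minimal sketch using only the Python stdlib; the model name here is just an example, substitute whatever you've pulled:

```python
import json
import urllib.request

# Minimal sketch: Ollama listens on localhost:11434 by default.
# "llama3.2" is just an example -- use whatever model you've pulled.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3.2",
        "prompt": "Why is the sky blue?",
        "stream": False,  # one JSON blob instead of a token stream
    }).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```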

I assumed from the name that it only ran llama-derived models, rather than whatever is available on Hugging Face. Is that not the case?

  • No, they have quite a broad list of models: https://ollama.com/search

    [edit] Oh, and apparently you can also run some models directly from Hugging Face: https://huggingface.co/docs/hub/ollama

    • Just use llama.cpp. Ollama tried to force their custom API (not the OpenAI standard), they obscure the downloaded models (making them a pain to use with other implementations), they blatantly used llama.cpp as a thin wrapper without properly communicating that, and now they have to differentiate somehow to start making money.

      If you've ever used a terminal, use llama.cpp. Afaik you can also pull models straight from Hugging Face with it (sketch below).
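      For example, llama-server speaks the OpenAI-style chat API out of the box. A rough sketch, assuming you've started it yourself with something like `llama-server -m ./model.gguf --port 8080` (recent builds also take --hf-repo/--hf-file to fetch GGUFs from Hugging Face, iirc, check --help):

      ```python
      import json
      import urllib.request

      # Rough sketch: assumes llama-server is already running, e.g.
      #   llama-server -m ./model.gguf --port 8080
      # It serves an OpenAI-compatible chat endpoint at /v1/chat/completions.
      body = {"messages": [{"role": "user", "content": "Hello!"}]}
      req = urllib.request.Request(
          "http://localhost:8080/v1/chat/completions",
          data=json.dumps(body).encode(),
          headers={"Content-Type": "application/json"},
      )
      with urllib.request.urlopen(req) as resp:
          print(json.loads(resp.read())["choices"][0]["message"]["content"])
      ```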
