Comment by benreesman
4 days ago
I won't use `ollama` on principle. I use `llama-cli` and `llama-server` when I'm not linking `ggml`/`gguf` directly. It's like two extra commands to use the one by the genius who wrote it rather than the one by the guys who just jacked it.
The models are on HuggingFace, and downloading them is `uvx huggingface-cli`. The `GGUF` quants came from `TheBloke` (with a grant from pmarca, IIRC) for ages, and now everyone does them (`unsloth` does a bunch).
Maybe I've got it twisted, but it seems to me that the people who actually do `ggml` aren't happy about it, and I've got their back on this.
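For what it's worth, the "two extra commands" look roughly like the sketch below. It only builds and prints the command lines, it doesn't run anything. The repo id, filename, and `models` directory are made-up placeholders, and the `--from huggingface_hub` spelling is my assumption about which package ships `huggingface-cli`; adjust to whatever your setup actually uses.

```python
import shlex


def download_cmd(repo_id, filename, local_dir="models"):
    """Build the command that fetches one GGUF file from HuggingFace.

    Assumption: `huggingface-cli` comes from the `huggingface_hub` package,
    so uvx is pointed at it with --from.
    """
    return [
        "uvx", "--from", "huggingface_hub",
        "huggingface-cli", "download",
        repo_id, filename,
        "--local-dir", local_dir,
    ]


def server_cmd(model_path, port=8080):
    """Build the command that serves a local GGUF file with llama-server."""
    return ["llama-server", "-m", model_path, "--port", str(port)]


if __name__ == "__main__":
    # Hypothetical repo and file names, purely for illustration.
    repo = "some-org/some-model-GGUF"
    gguf = "some-model.Q4_K_M.gguf"

    print(shlex.join(download_cmd(repo, gguf)))
    print(shlex.join(server_cmd(f"models/{gguf}")))
```

Swap `llama-server` for `llama-cli -m <path> -p "<prompt>"` if you just want a one-off completion in the terminal instead of a server.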