Comment by DiabloD3
2 days ago
I'm not going to open an issue on this, but you should consider expanding on the self-hosting part of the handbook and explicitly recommend llama.cpp for local self-hosted inference.
The self-hosting section covers the corporate use case with vLLM and SGLang, as well as personal desktop use with Ollama, which is a wrapper over llama.cpp.
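For what it's worth, the practical difference for an end user is small once any of these servers is running: llama.cpp's llama-server, vLLM, and Ollama all expose an OpenAI-compatible HTTP API, so the same client code talks to any of them. A minimal sketch in Python (the base URL, port, and model name are assumptions; adjust them to whatever your local server actually serves):

```python
# Minimal sketch: query a locally self-hosted, OpenAI-compatible endpoint.
# The same request works whether the backend is llama.cpp's llama-server,
# vLLM, or Ollama, since all three expose a /v1/chat/completions route.
# The URL, port (8080), and model name below are placeholders, not recommendations.
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1/chat/completions"  # assumed local server address

payload = {
    "model": "local-model",  # placeholder; use the model name your server reports
    "messages": [
        {"role": "user", "content": "Summarize why self-hosting matters in one sentence."}
    ],
    "temperature": 0.7,
}

req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Send the request and print the assistant's reply from the standard response shape.
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```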
Recommending Ollama isn't useful for end users; it's just a trap in a nice-looking wrapper.
Strong disagree on this. Ollama is great for moderately technical users who aren't really programmers or proficient with the command line.