
Comment by evilduck

7 days ago

Llama.cpp is not really that easy unless your platform is covered by their prebuilt binaries. Go to the llama.cpp GitHub page and find a prebuilt CUDA-enabled release for a Fedora-based Linux distro. Oh, there isn't one, you say? Welcome to losing an hour or more of your time.

Then you want to swap models on the fly. llama-swap, you say? You now get to learn a new custom YAML-based config file syntax that does basically nothing the Ollama Modelfile doesn't already do, so that you can ultimately... have the same experience as Ollama, except now you've lost hours just to get back to square one.

Then you need it to start and be ready after a system reboot? Great, now you get to write some systemd services, move stuff into system-level folders, create some groups and users, and poof, there goes another hour of your time.
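
To make the systemd step concrete, here is a minimal sketch in Python that renders one such unit file as text. The binary path, model path, user name, and port are all hypothetical placeholders, and `render_unit` is an illustrative helper, not part of llama.cpp or Ollama. It assumes you already have a working `llama-server` binary somewhere.

```python
def render_unit(binary: str, model: str, user: str, port: int = 8080) -> str:
    """Return the text of a systemd unit that starts llama-server at boot.

    Every argument is a placeholder supplied by the caller; nothing here
    checks that the binary, model file, or user actually exist.
    """
    lines = [
        "[Unit]",
        "Description=llama.cpp server",
        "After=network-online.target",
        "Wants=network-online.target",
        "",
        "[Service]",
        # Run as a dedicated, unprivileged account rather than root.
        f"User={user}",
        # Listen on localhost only; change --host if other machines need access.
        f"ExecStart={binary} -m {model} --host 127.0.0.1 --port {port}",
        "Restart=on-failure",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical paths and user name, for illustration only.
    print(render_unit("/usr/local/bin/llama-server",
                      "/var/lib/llama/model.gguf",
                      "llama"))
```

You would still have to create the user yourself (for example with `useradd --system`), save the output under `/etc/systemd/system/`, and run `systemctl daemon-reload` followed by `systemctl enable --now` on the unit, which is where the rest of that hour goes.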