← Back to context

Comment by thot_experiment

2 months ago

I was pretty big on ollama, it seemed like a great default solution. I had alpha that it was a trash organization but I didn't listen because I just liked having a reliable inference backend that didn't require me to install torch. I switched to llama.cpp for everything maybe 6 months ago because of how fucking frustrating every one of my interactions with ollama (the organization) were. I wanna publicly apologize to everyone who's concerns I brushed off. Ollama is a vampire on the culture and their demise cannot come soon enough.

FWIW llama.cpp does almost everything ollama does better than ollama with the exception of model management, but like, be real, you can just ask it to write an API of your preferred shape and qwen will handle it without issue.

Oh I was completely wrong about the model management stuff btw, llama-server has fully fledged model management baked in now, you just have to make an *.ini with your model configs (most models can do this themselves, I pointed qwen3.6 at the relevant part of the docs and it wrote me an ini with all of my model configs in about 2 minutes) and you can swap between models via api or a dropdown menu in the UI.