Comment by axegon_

1 year ago

I think you are missing the point. To get things straight: llama.cpp is not hard to set up and get running. It was a bit of a hassle in 2023, but even then it was not catastrophically complicated if you were willing to read the errors you were getting. People are dissatisfied for two very valid reasons. The first is that ollama gives little to no credit to llama.cpp. The second is the point of the post: a PR, and not a huge one at that, has been open for over 6 months and has been completely ignored. Perhaps the ollama maintainers personally have no use for it, so they shrugged it off, but this is the equivalent of "it works on my computer". Imagine if all kernel devs used Intel CPUs and ignored every non-Intel CPU-related PR. I am not saying that the kernel mailing list is not a large-scale version of a countryside pub on a Friday night; it is. But the maintainers do acknowledge the efforts of people making PRs and do a decent job of addressing them. While small, the PR here is not trivial and should have been, at the very least, discussed. Yes, the workstation/server I use for running models has two Nvidia GPUs. But my desktop computer uses an Intel Arc, and in some scenarios, hypothetically, this PR might have been useful.

3 comments

lolinder 1 year ago

> To get things straight: llama.cpp is not hard to set up and get running. It was a bit of a hassle in 2023, but even then it was not catastrophically complicated if you were willing to read the errors you were getting.

It's made a lot of progress in that the README [0] now at least has instructions for how to download pre-built releases or Docker images, but you have to actually read the section entitled "Building the Project" to realize that it covers more than just building instructions. That is not accessible to the masses, and it's hard for me not to see that placement and prioritization as an intentional choice to be inaccessible (which is a perfectly valid choice for them!).

And that's aside from the fact that Ollama provides a ton of convenience features that are simply missing, starting with the fact that it looks like with llama.cpp I still have to pick a model at startup time, which means switching models requires SSHing into my server and restarting it.

None of this is meant to disparage llama.cpp: what they're doing is great, and they have chosen not to prioritize user convenience as their primary goal. That's a perfectly valid choice. And I'm also not defending Ollama's lack of acknowledgment. I'm responding to a very specific set of ideas that have been prevalent in this thread: that not only does Ollama not give credit, they're not even really doing very much "real work". To me that is patent nonsense. The last mile of packaging something in a way that is user friendly is often at least as much work; it's just not the kind of work that hackers who hang out on forums like this appreciate.

[0] https://github.com/ggerganov/llama.cpp

portaouflop 1 year ago

llama.cpp is hard to set up - I develop software for a living and it wasn’t trivial for me. ollama I can give to my non-technical family members and they know how to use it.

As for not merging the PR - why are you entitled to have a PR merged? This attitude of entitlement around contributions is very disheartening as an OSS maintainer - it’s usually more work to review/merge/maintain a feature than to open a PR. Also, no one is entitled to comments, discussion, or literally one second of my time as an OSS maintainer. This is imo the cancer that is eating open source.

tharant 1 year ago

> As for not merging the PR - why are you entitled to have a PR merged?

I didn’t get entitlement vibes from the comment; I think the author believes the PR could have wide benefit, and believes that others support his position, thus the post to HN.

I don’t mean to be preachy; I’m learning to interpret others by using a kinder mental model of society. Wish me luck!