Comment by jychang

2 days ago

    llamacpp> ls -l *llama*
    -rwxr-xr-x 1 root root 2505480 Aug  7 05:06 libllama.so
    -rwxr-xr-x 1 root root 5092024 Aug  7 05:23 llama-server

That's a terrible excuse; llama.cpp is just ~7.6 megabytes. You can easily ship a couple of copies of that. The current Ollama for Windows download is 700 MB.

I don't buy it. They're not willing to make a 700 MB download a few megabytes bigger, to ~730 MB, but they are willing to support a fork/rewrite indefinitely (and the fork is outside their core competency, as the current issues show)? What kind of decision-making is that?
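To be fair to the arithmetic here, the numbers from the `ls -l` output above do check out. A quick sketch (the "four extra copies" count is just an illustrative assumption, not anything Ollama has proposed):

```python
# Sizes taken from the ls -l listing quoted above, in bytes.
libllama = 2_505_480   # libllama.so
server   = 5_092_024   # llama-server

total = libllama + server
print(f"one copy: {total / 1e6:.1f} MB")        # ~7.6 MB per copy

# Hypothetical: bundling four extra copies next to a 700 MB installer.
extra_copies = 4
print(f"download: {700 + extra_copies * total / 1e6:.0f} MB")  # ~730 MB
```

So even several build variants of the inference binaries would only add a few percent to the existing download size, which is the core of the objection.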

It's ~700 MiB because they're likely redistributing the CUDA libraries so that users don't need to run that installer separately. Llama.cpp is a bit more "you are expected to know what you're doing" on that front. But yeah, you could plausibly ship multiple versions of the inference engine, although from a maintenance perspective that sounds like hell for any number of reasons.