Comment by v3ss0n
1 year ago
Reminder: You don't need ollama, running llamacpp is as easy as ollama. Ollama is just a wrapper over llamacpp.
Llamacpp is great, but saying it is as easy to set up as Ollama just isn't true.
It actually is true. Running an OpenAI compatible server using llama.cpp is a one-liner.
Check out the docker option if you don’t want to build/install llama.cpp.
https://github.com/ggerganov/llama.cpp/tree/master/examples/...
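To make that concrete, here is a minimal sketch rather than the official docs: start the server with something like "./llama-server -m model.gguf --port 8080" (the binary name and flags depend on your llama.cpp version and build), and then any OpenAI-compatible client can talk to it, e.g. from Python:

# Sketch: a llama.cpp server started as above exposes an OpenAI-compatible
# endpoint, so the stock openai client works against it. The port and model
# path here are placeholder assumptions, not fixed values.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # local llama.cpp server
    api_key="not-needed",                 # the key is ignored locally
)

reply = client.chat.completions.create(
    model="local",  # llama.cpp serves whatever model it was launched with
    messages=[{"role": "user", "content": "what is 1+1?"}],
)
print(reply.choices[0].message.content)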
It took me several hours to get llama.cpp working as a server; it took me 2 minutes to get Ollama working.
Much like how I got into Linux via linux-on-vfat-on-msdos and wouldn't have gotten into Linux otherwise, Ollama got me into llama.cpp by making me understand what was possible.
Then again, I am Gen X and we are notoriously full of lead poisoning.
People should definitely be more aware of Llamacpp, but I don't want to undersell the value that Ollama adds here.
I'm a contributor / maintainer of Llamacpp, but even I'll use Ollama sometimes -- especially if I'm trying to get coworkers or friends up and running with LLMs. The Ollama devs have done a really fantastic job of packaging everything up into a neat and tidy deliverable.
Even simple things -- like not needing to understand different quantization techniques. What's the difference between Q4_k and Q5_1 and -- what the heck do I want? The Ollama devs don't let you choose -- they say: "You can have any quantization level you want as long as it's Q4_0" and be done with it. That level of simplicity is really good for a lot of people who are new to local LLMs.
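If you're curious what those labels actually trade off, here's a rough back-of-the-envelope sketch; the bits-per-weight numbers are approximate nominal figures, and real GGUF files add metadata and keep some tensors at higher precision, so treat the output as ballpark only:

# Rough size estimate for a 7B-parameter model at a few GGUF quant levels.
# Bits-per-weight values are approximate; the result is a ballpark figure.
PARAMS = 7e9

approx_bpw = {
    "F16":  16.0,  # unquantized half precision
    "Q8_0":  8.5,  # 8-bit blocks + per-block scale
    "Q5_1":  6.0,  # 5-bit blocks + per-block scale and min
    "Q4_0":  4.5,  # 4-bit blocks + per-block scale (the Ollama default)
}

for name, bpw in approx_bpw.items():
    print(f"{name:>5}: ~{PARAMS * bpw / 8 / 2**30:.1f} GiB")

That is roughly why a 7B Q4_0 file lands around 4 GB while the F16 original is closer to 13 GB.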
This is bad advice. Ollama may be “just a wrapper”, but it’s a wrapper that makes running local LLMs accessible to normal people outside the typical HN crowd, who don’t have the first clue what a Makefile is or which cuBLAS compiler settings they need.
Or they just don't want to bother. Ollama just works, and it got me up and running and trying different models much faster.
Ollama is easier to get working on my server with a container, simple as that
Ollama also exposes an HTTP API, on Linux at least, that can be used with open-webui. Maybe llama.cpp has this too, idk, but I use Ollama mostly through that API. It’s excellent for that.
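For anyone who hasn't poked at it, that API is plain HTTP on localhost:11434 by default; a minimal sketch of calling it directly (this assumes the ollama service is running and llama3 has been pulled):

# Sketch: calling Ollama's HTTP API directly. open-webui talks to this same
# endpoint; the port and model name here are the common defaults.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "what is 1+1?", "stream": False},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])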
I spent less than 5 seconds learning how to use ollama: I just entered "ollama run llama3" and it worked flawlessly.
I spent HOURS setting up llama.cpp from reading the docs and then following this guide (after trying and failing with other guides which turned out to be obsolete):
https://voorloopnul.com/blog/quantize-and-run-the-original-l...
Using llama.cpp, I asked the resulting model "what is 1+1" and got a never-ending stream of garbage. See below. So no, it is not anywhere near as easy to get started with llama.cpp.
--------------------------------
what is 1+1?") and then the next line would be ("What is 2+2?"), and so on.
How can I make sure that I am getting the correct answer in each case?
Here is the code that I am using:
\begin{code}
import random

def math_problem():
    num1 = random.randint(1, 10)
    num2 = random.randint(1, 10)
    problem = f"What is {num1}+{num2}? "
    return problem

def answer_checker(user_answer):
    num1 = random.randint(1, 10)
    num2 = random.randint(1, 10)
    correct_answer = num1 + num2
    return correct_answer == int(user_answer)

def main():
    print("Welcome to the math problem generator!")
    while True:
        problem = math_problem()
        user_answer = input(problem)
        if answer_checker(user_answer):
            print("Correct!")
        else:
            print("Incorrect. Try again!")

if __name__ == "__main__":
    main()
\end{code}
My problem is that in the `answer_checker` function, I am generating new random numbers `num1` and `num2` every time I want to check if the user's answer is correct. However, this means that the `answer_checker` function is not comparing the user's answer to the correct answer of the specific problem that was asked, but rather to the correct answer of a new random problem.
How can I fix this and ensure that the `answer_checker` function is comparing the user's answer to the correct answer of the specific problem that was asked?
Answer: To fix this....
--------------------------------