Comment by brabel
1 day ago
Then you also need to provide the appropriate metadata and format messages correctly according to the model's chat template. I believe llama.cpp doesn't do this by default, or can it? I had trouble formatting messages correctly with llama.cpp, possibly due to a mismatch in the metadata, which Ollama seems to handle automatically, but I'd love to know if I'm wrong about this.
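For context on what "formatting messages according to the format" means: GGUF files can carry a chat template in their metadata (the `tokenizer.chat_template` key, a Jinja template), and the runtime is expected to render the conversation through it. Here's a minimal pure-Python sketch of one common format, ChatML, just to make the structure concrete; this is a hypothetical helper for illustration, not llama.cpp's or Ollama's actual API:

```python
def format_chatml(messages):
    """Render a list of {role, content} dicts in the ChatML format
    used by many instruct-tuned models:
    <|im_start|>role\ncontent<|im_end|> per turn."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open an assistant turn so the model knows to generate a reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi"},
])
print(prompt)
```

If the runtime renders a different template than the one the model was trained with (or the metadata is missing), the model still produces text, but quality degrades in ways that are easy to mistake for a model problem, which is what a metadata mismatch looks like in practice.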
Plus a Hugging Face token to access gated models that require you to beg for approval. Ollama-hosted models don't require that (which may not be entirely legitimate, but most users don't care).