Comment by wubrr

17 hours ago

> Does ollama support strict structured output or strict tool calls adhering to a json schema?

As far as I understand, this is generally not possible at the model level. The best you can do is wrap the call in a (non-LLM) JSON-schema validator and emit an error JSON when the LLM output doesn't match the schema. Some APIs do this for you, but it's not very complicated to do yourself.

Someone correct me if I'm wrong
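For what it's worth, a minimal sketch of that validate-and-emit-error wrapper (the `call_llm` stub here is a hypothetical stand-in for whatever client you actually use):

```python
import json
from jsonschema import validate, ValidationError  # pip install jsonschema

SCHEMA = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
}

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for your actual LLM client.
    return '{"name": "example"}'

def checked_call(prompt: str) -> dict:
    raw = call_llm(prompt)
    try:
        data = json.loads(raw)
        validate(data, SCHEMA)  # raises ValidationError on schema mismatch
        return data
    except (json.JSONDecodeError, ValidationError) as err:
        # Emit an error JSON instead of the non-conforming output.
        return {"error": str(err)}

print(checked_call("Give me a JSON object with a name field."))
```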

No, that's incorrect: llama.cpp supports supplying a context-free grammar during sampling, and it only samples tokens that conform to the grammar, never tokens that would violate it.
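For example, a sketch using the llama-cpp-python bindings, which expose llama.cpp's grammar-constrained sampling (the model path is a placeholder):

```python
# Grammar-constrained sampling via llama.cpp's GBNF support,
# here through the llama-cpp-python bindings.
from llama_cpp import Llama, LlamaGrammar

# GBNF grammar: output must be exactly {"answer": "<letters/digits/spaces>"}
grammar = LlamaGrammar.from_string(r'''
root   ::= "{" ws "\"answer\"" ws ":" ws string ws "}"
string ::= "\"" [a-zA-Z0-9 ]* "\""
ws     ::= [ \t\n]*
''')

llm = Llama(model_path="model.gguf")  # placeholder path
out = llm(
    "Answer in JSON: what is the capital of France?",
    grammar=grammar,  # tokens that would violate the grammar are never sampled
    max_tokens=64,
)
print(out["choices"][0]["text"])
```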

This is misinformation. Ollama has supported structured outputs that conform to a given JSON schema for months. Here's a post about this from last year: https://ollama.com/blog/structured-outputs
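With the Python client, the usage from that post looks roughly like this (model name is just an example):

```python
from ollama import chat
from pydantic import BaseModel

class Country(BaseModel):
    name: str
    capital: str
    languages: list[str]

response = chat(
    model='llama3.1',  # any local model works here
    messages=[{'role': 'user', 'content': 'Tell me about Canada.'}],
    format=Country.model_json_schema(),  # the JSON schema to enforce
)
country = Country.model_validate_json(response.message.content)
print(country)
```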

This is absolutely possible to do at the model level via logit shaping. llama.cpp's functionality for this is called GBNF. It's tightly integrated into the token sampling infrastructure, and is what Ollama builds upon for their JSON schema functionality.
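Conceptually, the logit shaping amounts to forcing the logit of every grammar-rejected token to -inf before each sampling step. A toy sketch of the idea, not llama.cpp's actual code:

```python
import math

def constrained_sample_step(logits, allowed_token_ids):
    # Toy logit shaping: mask out every token the grammar rejects,
    # then greedily pick the best remaining token.
    masked = [
        logit if tok_id in allowed_token_ids else -math.inf
        for tok_id, logit in enumerate(logits)
    ]
    return max(range(len(masked)), key=masked.__getitem__)

# e.g. the grammar only permits tokens 1 and 3 at this position:
print(constrained_sample_step([0.2, 1.5, 3.0, 0.7], {1, 3}))  # -> 1
```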

  • > It’s tightly integrated into the token sampling infrastructure, and is what Ollama builds upon for their JSON schema functionality.

    Do you mean that the functionality of generating an EBNF grammar from a JSON schema and using it for sampling is part of ggml, and all they have to do is use it?

    I assumed that this was part of llama.cpp, and thus another feature they'd have to re-implement and maintain.