Comment by tarruda

9 hours ago

> It’s tightly integrated into the token sampling infrastructure, and is what ollama builds upon for their json schema functionality.

Do you mean that the functionality of generating an EBNF grammar from a JSON schema and using it for sampling is part of ggml, and that all they have to do is use it?

I assumed this was part of llama.cpp, making it yet another feature they would have to re-implement and maintain.
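
For context, the mechanism being discussed can be sketched roughly like this (a toy sketch, not the actual ggml/llama.cpp API; `grammar_allows` and `digits_only` here are hypothetical stand-ins for a real grammar compiled from a JSON schema):

```python
# Illustrative sketch of grammar-constrained sampling: mask out every
# candidate token the grammar cannot accept next, then pick (here:
# greedily) from whatever remains.

def constrained_pick(tokens, logits, grammar_allows, prefix):
    """Pick the highest-logit token that keeps the output grammar-legal.

    `grammar_allows(prefix, token)` is a hypothetical predicate standing
    in for a real grammar derived from a JSON schema.
    """
    legal = [(t, l) for t, l in zip(tokens, logits) if grammar_allows(prefix, t)]
    return max(legal, key=lambda p: p[1])[0] if legal else None

# Toy stand-in for a grammar generated from {"type": "integer"}:
# only digit tokens are legal continuations.
def digits_only(prefix, token):
    return token.isdigit()

print(constrained_pick(["a", "7", "3"], [3.0, 2.0, 0.5], digits_only, ""))
# -> "7": "a" scores higher but is rejected by the grammar
```

The point is that the grammar acts as a filter over the model's next-token distribution, which is why it has to live close to the token sampling loop.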