Comment by peer0

21 days ago

This seems similar to what I've done with llama.cpp's "grammar-constrained generation" for my local agents. The difference is that, instead of catching a bad output and retrying, the grammar makes it literally impossible for the LLM to generate anything that doesn't match a specific schema of tool choices. It's amazing how much better small models get when you reduce the problem space to only grammatically correct answers.

Interesting, catching the problem upstream, effectively. How did you enforce the grammar?
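
For anyone following along, here is a rough sketch of what such a grammar could look like. It is not necessarily peer0's setup. It assumes the agent picks exactly one tool and replies with a JSON object like {"tool": "search"}; the function name `build_tool_grammar`, the tool names, and the output shape are all made up for illustration. The code only builds a grammar string in llama.cpp's GBNF format and does not call the model.

```python
import json


def build_tool_grammar(tool_names):
    """Build a GBNF grammar admitting only {"tool": "<name>"} for the given names.

    Hypothetical helper for illustration; assumes plain ASCII tool names.
    """
    if not tool_names:
        raise ValueError("need at least one tool name")
    # The inner json.dumps produces the JSON string the model must emit
    # (with its quotes); the outer one escapes it as a GBNF quoted literal.
    alternatives = " | ".join(json.dumps(json.dumps(name)) for name in tool_names)
    return "\n".join([
        'root ::= "{" ws "\\"tool\\":" ws tool ws "}"',
        f"tool ::= {alternatives}",
        "ws ::= [ \\t\\n]*",
    ])


if __name__ == "__main__":
    # Example tool names are placeholders, not from the original comment.
    print(build_tool_grammar(["search", "calculator"]))
```

The resulting text can be saved to a file and handed to llama.cpp (its CLI accepts a grammar file, to my knowledge via `--grammar-file`). Since the sampler then only allows tokens that keep the output inside the grammar, there is nothing to catch or retry on the tool-choice side.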