Comment by ironbound

23 days ago

Use one of these structured output libraries:

https://github.com/outlines-dev/outlines

https://github.com/jxnl/instructor

https://github.com/guardrails-ai/guardrails

https://www.askmarvin.ai/docs/text/transformation/

Some of them allow a JSON schema, others a Pydantic model (which you can transform to/from JSON).
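To illustrate the model-to-schema round trip, here is a minimal stdlib-only sketch. It uses a dataclass as a stand-in for a Pydantic model; `Person`, `to_json_schema`, and `from_json` are hypothetical names made up for this example and are not part of any of the linked libraries. In Pydantic v2 the corresponding calls would be `Model.model_json_schema()` and `Model.model_validate_json(text)`.

```python
import json
from dataclasses import dataclass, fields


# Hypothetical example type; not taken from any of the linked libraries.
@dataclass
class Person:
    name: str
    age: int


# Map a few Python types to JSON Schema type names (deliberately incomplete).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_json_schema(cls) -> dict:
    """Build a flat JSON Schema object for a dataclass with primitive fields."""
    return {
        "title": cls.__name__,
        "type": "object",
        "properties": {f.name: {"type": _JSON_TYPES[f.type]} for f in fields(cls)},
        "required": [f.name for f in fields(cls)],
    }


def from_json(cls, text: str):
    """Parse JSON text and build an instance, checking each field's type."""
    data = json.loads(text)
    for f in fields(cls):
        if not isinstance(data.get(f.name), f.type):
            raise ValueError(f"field {f.name!r} is missing or not {f.type.__name__}")
    # Ignore unknown keys; only declared fields are passed to the constructor.
    return cls(**{f.name: data[f.name] for f in fields(cls)})


if __name__ == "__main__":
    print(json.dumps(to_json_schema(Person), indent=2))
    print(from_json(Person, '{"name": "Ada", "age": 36}'))
```

The schema dict is what you would hand to a library or API that accepts a JSON schema, and `from_json` is the step that turns the model's reply back into a typed object.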

zby 20 days ago

Yeah, the author seems oblivious to that option. But to be fair, this post is more about the basic step of choosing the schema, which he argues is still a task for a human. In his next post https://www.domainlanguage.com/articles/context-mapping-an-a... applying some output schemas would be more useful.

By the way, you don't need to use a library to have output schemas: https://platform.openai.com/docs/guides/structured-outputs , https://platform.claude.com/docs/en/build-with-claude/struct...

johndough 20 days ago

To add to that, llama.cpp supports GBNF grammars, which allow generation of JSON, or even programming languages:

https://github.com/ggml-org/llama.cpp/blob/master/grammars/R...
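The general idea behind grammar-constrained sampling is that tokens the grammar forbids are masked out at every step, so the output cannot leave the grammar. Below is a toy sketch of that idea only. The state machine, the vocabulary, and `fake_model_scores` are all hypothetical stand-ins, and this is neither GBNF syntax nor llama.cpp's actual implementation.

```python
import json
import random

# Toy "grammar": a state machine accepting exactly {"ok":true} or {"ok":false}.
# Each state maps an allowed token to the next state. This is a hand-written
# stand-in for what a real grammar would describe.
GRAMMAR = {
    "start": {"{": "key"},
    "key": {'"ok"': "colon"},
    "colon": {":": "value"},
    "value": {"true": "close", "false": "close"},
    "close": {"}": "done"},
}

# Vocabulary includes tokens that would break the JSON if chosen.
VOCAB = ["{", "}", ":", '"ok"', "true", "false", "hello", "]"]


def fake_model_scores(rng: random.Random) -> dict:
    """Hypothetical stand-in for an LLM: a random score for every vocab token."""
    return {tok: rng.random() for tok in VOCAB}


def constrained_generate(seed: int = 0) -> str:
    """Pick the best-scoring token at each step, restricted to grammar-legal ones."""
    rng = random.Random(seed)
    state, out = "start", []
    while state != "done":
        scores = fake_model_scores(rng)
        allowed = GRAMMAR[state]
        # Mask: only tokens the grammar permits in this state are candidates.
        token = max(allowed, key=lambda t: scores[t])
        out.append(token)
        state = allowed[token]
    return "".join(out)


if __name__ == "__main__":
    print(json.loads(constrained_generate()))
```

Whatever scores the fake model assigns, "hello" and "]" can never be emitted, so every result parses as JSON without any retry.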

iLoveOncall 20 days ago

All those do is retry the generation if it fails and then give up.

Calling them structured output is a lie; it's structured validation.

If the LLM persists in generating bad JSON, those are useless.
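The retry-then-give-up pattern described above can be sketched as follows. This is a minimal illustration using only the standard library; `generate` is a hypothetical callable standing in for a model call, and the sketch makes no claim about how any particular library is implemented.

```python
import json


def generate_with_retries(generate, max_attempts=3):
    """Call generate() until it returns parseable JSON, or give up."""
    last_error = None
    for _ in range(max_attempts):
        text = generate()
        try:
            # Validation step: the output is only checked after generation.
            return json.loads(text)
        except json.JSONDecodeError as err:
            last_error = err
    # Every attempt failed validation, so nothing usable can be returned.
    raise ValueError(f"no valid JSON after {max_attempts} attempts") from last_error
```

If the generator keeps producing malformed text, the loop exhausts its attempts and raises, which is the failure mode the comment is pointing at.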