Comment by simonw

1 day ago

The structured data JSON output thing is a special case: it works by interacting directly with the "select next token" mechanism, restricting the LLM to only picking from a token that would be valid given the specified schema.

This makes invalid output (as far as the JSON schema goes) impossible, with one exception: if the model runs out of output tokens the output could be an incomplete JSON object.

Most of the other things that people call "guardrails" offer far weaker protection - they tend to use additional models which can often be tricked in other ways.

1 comment

simonw

manquer 1 day ago

You are right of course.

I didn't mean to imply that all methods give 100% reliability as the structured data does. My point was just that there are non system prompt approaches which give on par or better reliability and/or injection security, it is not just system prompt or bust as other posters suggest.