Comment by vitaelabitur

14 hours ago

> Increasing context length by complaining about schema errors is almost always worse from an end quality perspective than just retrying till the schema passes.

Another way to do this is to use a hybrid approach. You perform unconstrained generation first, and then constrained generation on the failures.
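For illustration, a minimal sketch of that hybrid loop, where `llm.generate` stands in for whatever client you use and the `schema=` argument is a hypothetical constrained-decoding option (only the `json`/`jsonschema` validation step is real):

```python
import json

import jsonschema  # third-party validator: pip install jsonschema


def hybrid_generate(prompt: str, schema: dict, llm) -> dict:
    """Unconstrained attempt first; constrained decoding only on schema failure."""
    raw = llm.generate(prompt)  # hypothetical client call, plain sampling
    try:
        obj = json.loads(raw)
        jsonschema.validate(obj, schema)
        return obj  # first attempt already satisfies the schema
    except (json.JSONDecodeError, jsonschema.ValidationError):
        pass  # fall through to the constrained retry

    # Hypothetical constrained-decoding call; output conforms by construction.
    raw = llm.generate(prompt, schema=schema)
    return json.loads(raw)
```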

There's no difference in the output distribution between always doing constrained generation and only doing it on the failures though. What's the advantage?

  • There's no advantage wrt output quality, but it can be more economical in some high-error regimes, with fewer LLM calls used in resampling (max 2 for most errors).

    • My point is that if you're capable of doing constrained generation and want to try once and then constrain on failure, then since that has the same output distribution as doing constrained generation in the first place (a sketch of the equivalence follows this thread), you'd be better off just doing constrained generation always (max of 1 LLM call for the class of errors fixed by this).

      There's only a different distribution with 2+ initial attempts before falling back to constrained, at least if I haven't screwed up any math.
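
For reference, a sketch of the single-attempt equivalence discussed above, under the idealizing assumption that constrained decoding samples exactly from the model distribution conditioned on the schema passing (call that event A):

```latex
% p = unconstrained model distribution, A = "output satisfies the schema".
% Hybrid: sample X ~ p; keep it if X \in A, otherwise resample from p(. | A).
\[
p_{\mathrm{hybrid}}(x)
  = \underbrace{p(x)}_{\text{first try passes}}
  + \underbrace{\bigl(1 - P(A)\bigr)\,p(x \mid A)}_{\text{constrained fallback}}
  = P(A)\,p(x \mid A) + \bigl(1 - P(A)\bigr)\,p(x \mid A)
  = p(x \mid A)
  \qquad \text{for } x \in A,
\]
% i.e. the same distribution as always running constrained generation.
```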