Comment by gwerbin

6 days ago

We were messing around at work last week building an AI agent that was supposed to only respond with JSON data. GPT and Sonnet more or less did what we wanted, but Gemma insisted on giving us a Python code snippet.

otabdeveloper4 6 days ago

> that was supposed to only respond with JSON data.

You need to constrain token sampling with grammars if you actually want to do this.
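
For anyone unfamiliar with the idea, here is a rough sketch of what constraining decoding with a grammar means. Everything in it is a made-up toy: `VOCAB`, `VALID_OUTPUTS`, and `fake_logits` are hypothetical stand-ins, the "grammar" is just a fixed set of accepted strings, and it picks tokens greedily rather than sampling. Real implementations compile a proper grammar or JSON schema and apply the mask to the model's actual logits before sampling.

```python
import json
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ['{', '}', '"answer"', ':', 'true', 'false', 'print(', 'Sure!']

# Toy "grammar": the complete set of strings we are willing to accept.
VALID_OUTPUTS = ['{"answer":true}', '{"answer":false}']


def allowed_tokens(prefix):
    """Indices of tokens that keep `prefix` a prefix of some valid output."""
    return [
        i for i, tok in enumerate(VOCAB)
        if any(v.startswith(prefix + tok) for v in VALID_OUTPUTS)
    ]


def fake_logits(prefix):
    """Stand-in for a model that would rather emit prose or code than JSON."""
    return [0.1, 0.1, 0.2, 0.1, 0.3, 0.5, 2.0, 3.0]


def constrained_greedy_decode(logits_fn, max_steps=10):
    out = ""
    for _ in range(max_steps):
        if out in VALID_OUTPUTS:
            break
        logits = logits_fn(out)
        allowed = set(allowed_tokens(out))
        # Mask every token the grammar forbids, then take the best survivor.
        masked = [
            logits[i] if i in allowed else -math.inf
            for i in range(len(VOCAB))
        ]
        best = max(range(len(VOCAB)), key=lambda i: masked[i])
        out += VOCAB[best]
    return out


result = constrained_greedy_decode(fake_logits)
parsed = json.loads(result)  # parses, because only grammar-legal tokens were emitted
```

Without the mask this toy model would pick `Sure!` at every step. With it, the highest-scoring tokens are never eligible, so the output is forced onto one of the accepted strings.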

written-beyond 6 days ago
That reduces the quality of the response though.
- debugnik 6 days ago
  
  As opposed to emitting non-JSON tokens and having to throw away the answer?
  
- Der_Einzige 6 days ago
  
  THIS IS LIES: https://blog.dottxt.ai/say-what-you-mean.html
  I will die on this hill, and I have a bunch of other arXiv links from better peer-reviewed sources than yours to back my claim up (i.e. NeurIPS-caliber papers with more citations than yours, which claim it does harm the outputs).
  Any actual impact of structured/constrained generation on the outputs is a SAMPLER problem, and you can fix what little impact may exist with things like https://arxiv.org/abs/2410.01103
  Decoding is intentionally nerfed/kept to top_k/top_p by model providers because of a conspiracy against high temperature sampling: https://gist.github.com/Hellisotherpeople/71ba712f9f899adcb0...
  

cubefox 6 days ago

Gemma≠Gemini