Comment by nl
19 hours ago
Try some of the models on OpenRouter if you are looking to save money. Gemma 4 31B is $0.12/M input, $0.37/M output vs $1/M input, $5/M output for Haiku.
There are other options that are good too. Gemini 3.1 Flash Lite is great for this kind of thing (NOT Gemini 3.5 Flash though - the pricing for that is bad).
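To put those per-token prices in perspective, here's a rough back-of-the-envelope sketch. The prices are the ones quoted above. The model keys are just labels (not real OpenRouter model IDs) and the 10M input / 2M output workload is made up for illustration.

```python
# Prices in USD per million tokens, as quoted in the comment above.
# The dictionary keys are illustrative labels, not actual API model IDs.
PRICES = {
    "gemma-4-31b": {"input": 0.12, "output": 0.37},
    "haiku": {"input": 1.00, "output": 5.00},
}


def cost_usd(model, input_tokens, output_tokens):
    """Estimate the cost in USD of a workload at the listed per-million prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 10M input tokens and 2M output tokens.
gemma_cost = cost_usd("gemma-4-31b", 10_000_000, 2_000_000)
haiku_cost = cost_usd("haiku", 10_000_000, 2_000_000)
print(f"Gemma: ${gemma_cost:.2f}, Haiku: ${haiku_cost:.2f}")
```

For that made-up workload it comes out to roughly $1.94 vs $20.00, so about a 10x difference. Your own input/output ratio will shift that.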
Cheers, I'll give it a try. How are those models at returning structured results? When I was writing the prompts for the analysis step and testing with older Claude models, they had trouble structuring the XML consistently. Sonnet 4.6 handles it really well.
Use function calling/tool use, not XML output. The models are all trained for that now.
I.e., instead of telling it to generate XML, give it a function definition to call.
That is actually a JSON schema, and the models do great at it. Here are the Claude docs, but they are all similar: https://platform.claude.com/docs/en/agents-and-tools/tool-us...
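A minimal sketch of what such a function definition might look like, using the `name` / `description` / `input_schema` shape from Claude's tool-use format (other providers use similar but not identical keys). The tool name `record_analysis`, its fields, and the `missing_required` helper are all made up for illustration, and no API call is made here.

```python
import json

# Hypothetical tool definition; the name and fields are illustrative only.
# The "input_schema" value is an ordinary JSON Schema object.
ANALYSIS_TOOL = {
    "name": "record_analysis",
    "description": "Record the structured result of analysing one document.",
    "input_schema": {
        "type": "object",
        "properties": {
            "summary": {"type": "string"},
            "sentiment": {
                "type": "string",
                "enum": ["positive", "neutral", "negative"],
            },
            "topics": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["summary", "sentiment"],
    },
}


def missing_required(tool, arguments):
    """Return the required fields that are absent from a tool call's arguments."""
    required = tool["input_schema"].get("required", [])
    return [name for name in required if name not in arguments]


# Stand-in for the arguments a model might return in a tool call.
example_arguments = json.loads(
    '{"summary": "Short and clear.", "sentiment": "positive"}'
)
print(missing_required(ANALYSIS_TOOL, example_arguments))
```

You'd pass the definition in the request's tool list and read the arguments off the returned tool call instead of parsing XML out of free text. A cheap check like `missing_required` is still worth keeping in case a smaller model drops a field.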
Very interesting. Thank you!