Comment by OutOfHere

4 days ago

That's all fine, but it should be noted that proper tool-calling using the LLM's structured-output functionality guarantees a schema-compliant response, because invalid tokens are culled as the response is generated (constrained decoding).
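To illustrate the "invalid tokens are culled as they're generated" point, here's a toy sketch of constrained decoding (everything here is made up for illustration; real implementations mask logits against a grammar or JSON-schema automaton inside the inference engine):

```python
import json
import random

VOCAB = ['{', '}', '"name"', '"age"', ':', ',', '"alice"', '42', 'banana']

def is_valid_prefix(tokens: list[str]) -> bool:
    """Very crude stand-in for a schema automaton: does this sequence
    extend toward the one object the 'schema' allows?"""
    target = ['{', '"name"', ':', '"alice"', ',', '"age"', ':', '42', '}']
    return tokens == target[:len(tokens)]

def constrained_sample(max_steps: int = 20) -> str:
    rng = random.Random(0)
    out: list[str] = []
    for _ in range(max_steps):
        # Cull invalid continuations *before* sampling -- this is why
        # the final output is guaranteed to be compliant.
        allowed = [t for t in VOCAB if is_valid_prefix(out + [t])]
        if not allowed:
            break
        out.append(rng.choice(allowed))
    return ''.join(out)

result = constrained_sample()
print(json.loads(result))  # always parses: non-compliant tokens never got sampled
```

Tokens like `banana` simply never make it into the output, no matter what the sampler wants.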

But now you're limited by the model, by the provider, and by how faithfully the model actually fills in that output.

While using structured outputs is great, it can cause a large performance hit, and you lose control over the process - e.g. having a small, fast model (say, via Groq) fix an invalid response often works faster than having a large model generate a structured response in the first place.
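The "fix it with a small model" pattern looks roughly like this (the small-model call is stubbed out here so the sketch runs offline; in practice it would be a cheap hosted model, e.g. on Groq):

```python
import json
import re

def small_model_fix(bad_json: str) -> str:
    """Stand-in for a fast, cheap LLM call that repairs malformed JSON.
    Here it only strips trailing commas, just to make the sketch runnable."""
    return re.sub(r',\s*([}\]])', r'\1', bad_json)

def parse_with_repair(raw: str) -> dict:
    try:
        return json.loads(raw)  # fast path: output was already valid
    except json.JSONDecodeError:
        # one cheap repair round-trip instead of a slow constrained generation
        return json.loads(small_model_fix(raw))

# A big model in free-form mode often emits near-valid JSON like this:
raw = '{"tool": "get_weather", "args": {"city": "Berlin",},}'
print(parse_with_repair(raw))
```

You validate the free-form output yourself and only pay for the repair call when it actually fails.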

Have 50 tools? It's faster and more precise to stack two small models, or do a search to select the relevant tools, pass in basic definitions for just those, and have the model output a function call, than to feed it all 50 tools defined as JSON.
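The search-then-call idea can be sketched like this (the scoring here is trivial keyword overlap just to keep it self-contained - in practice you'd use embeddings or BM25, and all tool names below are made up):

```python
# Instead of shipping all 50 JSON tool schemas to the model, shortlist a
# few candidates first and only put those short definitions in the prompt.
TOOLS = {
    "get_weather":   "look up the current weather for a city",
    "send_email":    "send an email to a recipient",
    "create_ticket": "open a support ticket with a title and body",
    # ... imagine ~50 of these
}

def top_k_tools(query: str, k: int = 2) -> list[str]:
    q = set(query.lower().split())
    return sorted(
        TOOLS,
        key=lambda name: -len(q & set(TOOLS[name].split())),
    )[:k]

candidates = top_k_tools("what's the weather in Berlin?")
prompt_defs = "\n".join(f"{name}: {TOOLS[name]}" for name in candidates)
print(candidates)
```

The model then only has to pick among a couple of candidates and emit one function call, rather than wade through 50 schemas.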

While structured responses themselves are fine, it really depends on the use case and on the provider. If you can afford the extra compute seconds, yeah, it's great. If you can't, then nothing beats having absolute control over your choice of provider, model, and output format.