Comment by alex7o

16 hours ago

From all the models that do toolcalls the only thing I am confused is why did you pick the worst? Or maybe they are only bad in agentic work it fine for one shot toolcalls?

Gemini is pretty solid for 1-shot tool call and affordable as well.

  • My general understanding of the concenus on most models these days is that people consider google models to be some of the worst at tool calling, so certainly an interesting choice. Did you do any evals on this?

  • Hi, would love to know where you get that impression on 1 shot tool calling, was there concrete evaluation carried out? pretty new to this and was a bit lost when trying to compare models on different capabilities.