Comment by alexsmirnov
9 hours ago
This was implemented a long time ago, at least by Hugging Face's "smolagents": https://huggingface.co/docs/smolagents/index . I did use it, with evaluations. In most cases, a modern model's tool calling outperforms a code agent. Models are just trained to use tools, not to write code.
The one thing that LLM tool calls can't do reliably is handle a lot of data. If tool A emits data that tool B needs, and that data is large compared to the model's context, chaining the tools in a code fragment where they are exposed as functions saves a lot of pain.
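A rough sketch of what that chaining looks like, in plain Python rather than any particular agent framework. The tools `fetch_rows` and `count_by_amount` are made up for illustration, and serialized character count is only a crude stand-in for how much context the data would eat if it were passed through the model between two separate tool calls.

```python
import json


def fetch_rows(n: int) -> list[dict]:
    """Hypothetical tool A: emits a large result set."""
    return [{"id": i, "amount": i % 7} for i in range(n)]


def count_by_amount(rows: list[dict]) -> dict[int, int]:
    """Hypothetical tool B: consumes tool A's output and aggregates it."""
    counts: dict[int, int] = {}
    for row in rows:
        counts[row["amount"]] = counts.get(row["amount"], 0) + 1
    return counts


def run_chained(n: int) -> dict[int, int]:
    # The kind of fragment a code agent could write: the intermediate
    # rows stay inside the interpreter and only the small aggregate
    # is returned to the model.
    return count_by_amount(fetch_rows(n))


def context_cost(n: int) -> tuple[int, int]:
    # Characters the model would have to read (and re-emit) if tool A's
    # output went through the context, versus characters it sees when
    # the two tools are chained in code.
    via_tool_calls = len(json.dumps(fetch_rows(n)))
    via_code = len(json.dumps(run_chained(n)))
    return via_tool_calls, via_code


if __name__ == "__main__":
    print(context_cost(50_000))
```

With separate tool calls the model has to carry every row from A to B itself; in the chained version it only sees the final dictionary.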
I had the same experience using smolagents. In early 2025 it was a competitive approach, but a year later a small set (<10) of flexible tools outperforms the single-tool approach.