Comment by rao-v

13 hours ago

In a month or three we’ll have the sensible approach, which is smaller cheaper fast models optimized for looking at a query and identifying which skills / context to provide in full to the main model.

It’s really silly to waste big model tokens on throat clearing steps

I thought most of the major AI programming tools were already doing this. Isn't this what subagents are in Claude code?

  • I don't know about Claude Code but in GitHub Copilot as far as I can tell the subagents are just always the same model as the main one you are using. They also need to be started manually by the main agent in many cases, whereas maybe the parent comment was referring about calling them more deterministically?

  • Sub-agents are typically one of the major models but with a specific and limited context + prompt. I’m talking about a small fast model focused on purely curating the skills / MCPs / files to provide to the main model before it kicks off.

    Basically use a small model up front to efficiently trigger the big model. Sub agents are at best small models deployed by the bigger model (still largely manually triggered in most workflows today)