Comment by gner75
1 day ago
I'm not sure I get this. If anything, they'll consume less tokens, because their context will possibly contain a subset of the original single agent prompt, and they only need to see a subset of the original single agent history.
What am I missing?
Take a look at my example here - having a bunch of sub-agents perform a task consumed 50,000+ tokens each across 5 subtasks, because each one had to consume duplicate information. https://simonwillison.net/2025/Oct/11/sub-agents/
But that's down to the way Claude Code has implemented it? If I code this myself I could engineer so that the subagents don't have overlapping context with the orchestrator.
Also, memory itself can be a tool the subagent calls to retrieve only the stuff it needs.