Comment by rgbrenner

19 hours ago

The article has no date on it, but says deferred tool loading is a recent update that occurred after the article was written. Deferred tool loading was added in Nov 2025: https://www.anthropic.com/engineering/advanced-tool-use

So these numbers are at least 7 months out of date. Why is this being posted now?

+1

Its crazy that people are still discussing this. It's ancient history. Deferred tool loading, large contexts, and prompt caching have made 2026 completely different from 2025.

Also, the "CLI saves token" debate really falls apart when step one of using the CLI is running "--help". The problem remains: if knowing how to call the thing isn't in parametric memory, it has to be in context.

Deferred tool loading is not part of MCP. It's a Claude API special parameter that most other LLM APIs do not support.

  • OpenAI API also supports defer_loading https://developers.openai.com/api/docs/guides/tools-tool-sea...

    And it's not actually necessary for it to exist at the API level. It's a pattern. Making it API-side is just an optimization.

    To do it client-side: 1. Define a single tool, tool_search 2. List the names of your deferred tools in context (or tool_search's description) 3. When tool_search is called, match the query against the tool names (or names + descriptions) 4. Append the matched tool def to the context in a new <system>-esque tag

    Claude Code (as of the leak) does this client side. You can even see the custom matching function and A/B tests about whether to include the descriptions.

    Whether or not that tool definition comes from MCP or a local definition is kind of beside the point.

  • On the flip side, Claude is at fault in not letting you choose which tools on which MCP servers to keep in context. When I first starting using MCP about a year ago (not on Claude Code), my tools actually let me selectively turn on/off individual tools.

    Crazy that the company that invented MCP is not putting basic features like this in the product.

    • I think if you deny a tool, it won't be loaded in context at all ever, even it's name and description won't be loaded.

  • Deferred cli/skill loading is also not part of CLIs or skills, it's all about how the coding agent/harness is implemented.