Comment by rgbrenner

19 hours ago

The article has no date on it, but says deferred tool loading is a recent update that occurred after the article was written. Deferred tool loading was added in Nov 2025: https://www.anthropic.com/engineering/advanced-tool-use

So these numbers are at least 7 months out of date. Why is this being posted now?

12 comments

rgbrenner

red_hare 15 hours ago

Its crazy that people are still discussing this. It's ancient history. Deferred tool loading, large contexts, and prompt caching have made 2026 completely different from 2025.

Also, the "CLI saves token" debate really falls apart when step one of using the CLI is running "--help". The problem remains: if knowing how to call the thing isn't in parametric memory, it has to be in context.

fooster 13 hours ago
Build a more specific skill the for the exact workflow you want?
- didibus 13 hours ago
  
  Skill still needs to be loaded in context, what would it change?
  
  3 replies →

mkl 12 hours ago

Older than that, as it implies GPT-4o is current.

wild_egg 18 hours ago

Deferred tool loading is not part of MCP. It's a Claude API special parameter that most other LLM APIs do not support.

red_hare 14 hours ago

OpenAI API also supports defer_loading https://developers.openai.com/api/docs/guides/tools-tool-sea...
And it's not actually necessary for it to exist at the API level. It's a pattern. Making it API-side is just an optimization.
To do it client-side: 1. Define a single tool, tool_search 2. List the names of your deferred tools in context (or tool_search's description) 3. When tool_search is called, match the query against the tool names (or names + descriptions) 4. Append the matched tool def to the context in a new <system>-esque tag
Claude Code (as of the leak) does this client side. You can even see the custom matching function and A/B tests about whether to include the descriptions.
Whether or not that tool definition comes from MCP or a local definition is kind of beside the point.
BeetleB 14 hours ago
On the flip side, Claude is at fault in not letting you choose which tools on which MCP servers to keep in context. When I first starting using MCP about a year ago (not on Claude Code), my tools actually let me selectively turn on/off individual tools.
Crazy that the company that invented MCP is not putting basic features like this in the product.
- didibus 13 hours ago
  
  I think if you deny a tool, it won't be loaded in context at all ever, even it's name and description won't be loaded.
didibus 13 hours ago

Deferred cli/skill loading is also not part of CLIs or skills, it's all about how the coding agent/harness is implemented.