Comment by albert_e
10 hours ago
I want to create a "harness" that does this with Claude Code and other expensive agents.
Buffer user prompts, use conversation history and repo state as context -- and run a local model or a cheap and fast cloud model like Haiku to determine the optimal way to address the user's ask, reframe the query with better context (user reviews and approves if needed) and THEN let expensive models like Opus have a go at it.
If we are operating within Anthropic ecosystem with Haiku and Opus -- this sort of logic should ideally be doable within Claude Code as harness. Currently skills cannot be tagged to different models. Ideally we should be able to say -- for trivial tasks, the skill should always use Haiku even if invoked from a session with Opus xhigh.
> Currently skills cannot be tagged to different models. Ideally we should be able to say -- for trivial tasks, the skill should always use Haiku even if invoked from a session with Opus xhigh.
You can set the model for a skill. You just set model: haiku at the top and it will use haiku! You can even set the effort level, look for “Frontmatter reference” in this doc article: https://code.claude.com/docs/en/skills
Same works for subagents — .claude/agents/triager.md with model: haiku plus a Task call from the main loop. The reason to roll your own was the sandbox, not the routing.
Interesting - thanks!
Not sure when this was added.
I found "open" feature requests in GitHub asking for this exact thing
We considered wrapping Claude Code when we started building Mendral (this agent in the article). We ended up building our own agent, it's lot more work because we followed all the right patterns as the models evolved (sub-agents, proper token caching, redo basic tools like read,write,edit,bash, etc...). But it paid off over time when you build an agent that is focused on a specific task (not a general coding agent).
The main driver for writing our own agent was to leave it out of the sandbox (the agent loop runs on our backend, we call the sandbox only when needed). We wrote another post about that (it's the latest post on the blog).
However, I am curious how would you implement the triager pattern by only using Claude Code as harness.