Comment by jgbuddy

23 days ago

Am I missing something here?

Obviously, directly including context in something like a system prompt will put it in context 100% of the time. You could just as easily take all of an agent's skills, feed them to the agent (in a system prompt or similar), and it would follow the instructions more reliably.

However, at a certain point you have to use skills, because including everything in the context every time is wasteful or not possible. This is the same reason Anthropic is doing advanced tool use (ref: https://www.anthropic.com/engineering/advanced-tool-use): there's not enough context to straight up include everything.

It's all a context / price trade-off. Obviously, if you have the context budget, just include what you can directly (in this case, by compressing it into an AGENTS.md).

22 comments


jstummbillig 23 days ago

> Obviously directly including context in something like a system prompt will put it in context 100% of the time.

How do you suppose skills get announced to the model? It's all in the context in some way. The interesting part here is: Just (relatively naively) compressing stuff in the AGENTS.md seems to work better than however skills are implemented.

cortesoft 23 days ago
Isn't the difference that a skill means you just have to add the script name and explanation to the context instead of the entire script plus the explanation?
- majormajor 23 days ago
  
  Their non-skill based "compressed index" is just similarly "Each line maps a directory path to the doc files it contains" but without "skillification." They didn't load all those things into context directly, just pointers.
  They also didn't bother with any more "explanation" beyond "here are paths for docs."
  But this straightforward "here are paths for docs" produced better results, and IMO it makes sense since the more extra abstractions you add, the more chance of a given prompt + situational context not connecting with your desired skill.
- sevg 23 days ago
  
  You could put the name and explanation in CLAUDE.md/AGENTS.md, plus the path to the rest of the skill that Claude can read if needed.
  That seems roughly equivalent to my unenlightened mind!
- verdverm 23 days ago
  
  I like to think about it this way: you want to put some high-level, table-of-contents, SparkNotes-like stuff in the system prompt. This helps warm up the right pathways. There, you also need to tell the model that there is more it may need, depending on "context", which it can reach through filesystem traversal or search tools. The difference between those is unimportant, except that most things outside of coding typically don't do filesystem things the same way.
  
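To make the "each line maps a directory path to the doc files it contains" idea from this sub-thread concrete, here is a minimal sketch of building such an index. It is not Vercel's actual format or tooling; the function name `build_compressed_index` and the example paths are made up for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def build_compressed_index(doc_paths):
    """Return one line per directory, formatted as 'dir: file1, file2'."""
    by_dir = defaultdict(list)
    for raw in doc_paths:
        path = PurePosixPath(raw)
        by_dir[str(path.parent)].append(path.name)
    return "\n".join(
        f"{directory}: {', '.join(sorted(names))}"
        for directory, names in sorted(by_dir.items())
    )


# Hypothetical doc paths, for illustration only.
example_paths = [
    "docs/routing/layouts.md",
    "docs/routing/middleware.md",
    "docs/caching/revalidation.md",
]

# The resulting text is what would be pasted into AGENTS.md:
# pointers to the docs, not the docs themselves.
print(build_compressed_index(example_paths))
```

The point of the sketch is that only the pointers land in the always-loaded context; the agent still has to open a file to read its contents.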
jmathai 23 days ago
Skills have frontmatter which includes a name and description. The description is what determines if the llm finds the skill useful for the task at hand.
If your agent isn’t being used, it’s not as simple as “agents aren’t getting called”. You have to figure out how to get the agent invoked.
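As a rough illustration of why the description carries so much weight, here is a sketch of a harness that announces skills by frontmatter alone. This is an assumption about how such a listing could be built, not a description of any real implementation; `parse_frontmatter`, `announce`, and the example skill are hypothetical, and the parser only handles simple `key: value` lines rather than full YAML.

```python
def parse_frontmatter(text):
    """Parse simple 'key: value' lines between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def announce(skill_texts):
    """Build the short listing the model sees: names and descriptions only."""
    entries = []
    for text in skill_texts:
        meta = parse_frontmatter(text)
        if "name" in meta and "description" in meta:
            entries.append(f"- {meta['name']}: {meta['description']}")
    return "\n".join(entries)


# Hypothetical skill file, for illustration only.
EXAMPLE_SKILL = """---
name: pdf-report
description: Generate a PDF report from a CSV file.
---
Step-by-step instructions that are only read if the skill is invoked.
"""

print(announce([EXAMPLE_SKILL]))
```

Under this assumption the body of the skill never reaches the model unless the one-line description convinces it to load the rest.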
- Spivak 23 days ago
  
  Sure, but then you're playing a very annoying and boring game of model-whispering against specific versions of models that are ever changing, while hoping the model responds correctly no matter what user input surrounds your prompt.
  I really only think the game is worth playing when it's against a fixed version of a specific model. The amount of variance we observe between different releases of the same model is enough to require us to update our prompts and re-test. I don't envy anyone who has to try and find some median text that performs okay on every model.
  

observationist 23 days ago

This is one of the reasons the RLM methodology works so well. You have access to as much information as you want in the overall environment, but only the things relevant to the task at hand get put into context for the current task, and it shows up there 100% of the time, as opposed to lossy "memory" compaction and summarization techniques, or probabilistic agent skills implementations.

Having an agent manage its own context ends up being extraordinarily useful, on par with the leap from non-reasoning to reasoning chats. There are still issues with memory and integration, and other LLM weaknesses, but agents are probably going to get extremely useful this year.

judahmeek 23 days ago

> only the things relevant to the task at hand get put into context for the current task
And how do you guarantee that said relevant things actually get put into the context?
OP is about the same problem: relevant skills being ignored.

_the_inflator 23 days ago

I agree with you.

I think Vercel mixes skills and context configuration up. So the whole evaluation is totally misleading because it tests for two completely different use cases.

To sum it up: Vercel should use both, agents.md in combination with skills. The two serve totally different purposes.

verdverm 23 days ago

You aren't wrong, you really want a bit of both.

1. You absolutely want to force certain context in, no questions or non-determinism asked (index and sparknotes). This can be done conditionally, but still rule based on the files accessed and other "context"

2. You want to keep it clean and only provide useful context as necessary (skills, search, mcp; and really an explore/query/compress mechanism around all of this, ralph wiggum is one example)
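A minimal sketch of point 1, assuming the rules are simple glob patterns over the files the agent has touched. The names `ALWAYS_INCLUDE`, `RULES`, and `select_context`, and the snippets themselves, are hypothetical.

```python
from fnmatch import fnmatch

# Context that is forced in on every request (the index and sparknotes).
ALWAYS_INCLUDE = ["docs index (compressed)"]

# Hypothetical rules: (glob pattern over accessed files, context to add).
RULES = [
    ("*.tsx", "React component conventions"),
    ("migrations/*", "Database migration checklist"),
]


def select_context(accessed_files):
    """Return the forced context plus any rule-triggered context, in order."""
    selected = list(ALWAYS_INCLUDE)
    for pattern, snippet in RULES:
        matched = any(fnmatch(path, pattern) for path in accessed_files)
        if matched and snippet not in selected:
            selected.append(snippet)
    return selected


print(select_context(["src/app/page.tsx"]))
```

Because the selection is a plain rule match, the same files always produce the same context; nothing depends on the model deciding to ask for it.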

teknopaul 23 days ago

My reading was that copying the doc's ToC in markdown + links was significantly more effective than giving it a link to the ToC and instructions to read it.

Which makes sense.

And there are some numbers that prove it.

singingbard 22 days ago

So you’re not missing anything if you use Claude by yourself. You just update your local system prompt.

Instead it’s a problem when you’re part of a team and you’re using skills for standards like code style or architectural patterns. You can’t ask everyone to constantly update their system prompt.

Claude skill adherence is very low.

orlandohohmeier 23 days ago

I’ve been using symlinked agent files for about a year, as a hacky workaround from before skills became a thing, to load additional “context” for different tasks, and it might actually address the issue you’re talking about. Honestly, it’s worked so well for me that I haven’t really felt the need to change it.
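The comment doesn't describe the actual setup, so the following is only a guess at what a symlink-based workflow might look like: one always-loaded file name that points at whichever task-specific file is currently needed. The helper `activate_context`, the `contexts/` directory, and the file names are all hypothetical.

```python
import os
import tempfile
from pathlib import Path


def activate_context(project_dir, context_file, link_name="AGENTS.md"):
    """Point project_dir/link_name at context_file via a symlink."""
    link = Path(project_dir) / link_name
    # Only replace an existing symlink; a regular file is left alone,
    # in which case os.symlink raises FileExistsError.
    if link.is_symlink():
        link.unlink()
    os.symlink(Path(context_file).resolve(), link)
    return link


def demo():
    """Switch between two hypothetical task contexts in a temp project."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "contexts").mkdir()
        (root / "contexts" / "review.md").write_text("review rules")
        (root / "contexts" / "refactor.md").write_text("refactor rules")
        activate_context(root, root / "contexts" / "refactor.md")
        link = activate_context(root, root / "contexts" / "review.md")
        return link.read_text()


print(demo())
```

The agent always reads the same path, so whatever the link targets is in context every time, with no invocation step to miss.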

mbm 23 days ago

What sort of files do you generally symlink in?

deaux 23 days ago

You're right, the results are completely as expected.

The article also doesn't mention that they don't know how the compressed index affects output quality. That's always a concern with this kind of compression. Skills are just another, different kind of compression, one with a much higher compression rate and presumably less likely to negatively influence quality. The cost is that a skill doesn't always get invoked.

TeeWEE 23 days ago

Indeed seems like Vercel completely missed the point about agents.

In Claude Code you can invoke an agent when you want as a developer and it copies the file content as context in the prompt.