Comment by bob1029
9 hours ago
I think pre-canned "skills" are an anti-pattern with the frontier models. Arguably, these skills already exist within the LLM. We don't need to explain how to do things they already know how to do.
I prefer to completely invert this problem: instead of explaining things up front, provoke the model into surfacing the desired behavior and capability by having the environment push back on it over time.
You get way more interesting behavior from agents when you allow them to probe their environment for a few turns and feed them errors about how their actions are inappropriate. It doesn't take very long for the model to "lock on" to the expected behavior if you are detailed in your tool feedback. I can get high quality outcomes using blank system prompts with good tool feedback.
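To make that concrete, here's a rough sketch of what "detailed tool feedback" means in practice. Everything here is made up for illustration (the tool name, table, and error strings are not from any real system): the point is that a failed call returns a corrective message describing what the environment expects, rather than a bare failure.

```python
# Hypothetical tool: a read-only SQL endpoint that teaches through its errors.
# The model gets told *why* the call was inappropriate and what to do instead,
# which is what lets it "lock on" within a few turns.
def run_sql(query: str) -> str:
    q = query.strip().lower()
    if not q.startswith("select"):
        # Don't just fail -- state the expected behavior explicitly.
        return ("ERROR: only read-only SELECT statements are permitted here. "
                "Rewrite your query as a SELECT against the `orders` table.")
    if "orders" not in q:
        # Name what *is* available so the next attempt can succeed.
        return ("ERROR: unknown table. Available tables: "
                "orders(id, total, created_at). Query one of these instead.")
    return "OK: 3 rows returned."
```

A first attempt like `run_sql("DROP TABLE orders")` gets back the corrective message, and the retry tends to conform, with no system prompt involved.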
I think skills actually complement what you're saying very well.
> You get way more interesting behavior from agents when you allow them to probe their environment for a few turns and feed them errors about how their actions are inappropriate. It doesn't take very long for the model to "lock on" to the expected behavior if you are detailed in your tool feedback. I can get high quality outcomes using blank system prompts with good tool feedback.
My primary way of developing skills (and previously cursor rules) is to start blank, let the LLM explore, and correct it as we go until the problem is solved. I then ask it to generate a skill (or rule) that explains the process in a way it can refer to in order to repeat it later. Next time something like that comes up, we use the skill. If any correction is needed, I tell it to update the skill.
That way we get to have it explore and get more context initially, and then essentially "cache" that summarized context on the process for another time.
Error feedback from tools could be argued to be isomorphic with skills (or the development of them). It tracks with how we learn things in meatspace. Whatever strings we return in response to a bad SQL query or compiler error could also include the contents of some skill.md file.
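A sketch of that last idea, with hypothetical names throughout: the error string handed back to the model simply concatenates the failure with the contents of a relevant skill file, so the correction and the "skill" arrive in the same feedback turn.

```python
from pathlib import Path

def tool_error(message: str, skill: str) -> str:
    """Return a tool error that also carries the matching skill.md,
    merging error feedback and skill delivery into one string."""
    skill_path = Path("skills") / f"{skill}.md"  # assumed location
    hint = skill_path.read_text() if skill_path.exists() else ""
    return f"ERROR: {message}\n\n{hint}".rstrip()
```

So a bad SQL query might return `tool_error("syntax error near 'FORM'", "sql")`, and the model reads the fix and the broader guidance together.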
What about libraries that are not in their training data? (e.g. new libraries, private libraries)
Or knowledge that is in their training data, but where the majority of that data doesn't follow best practices? (e.g. Web Content Accessibility Guidelines)
In those cases, I think there's a fair point to having a bunch of markdown doc files detailing them.