Comment by zby
5 days ago
I still don't get what is special about the skills directory. Since like forever I have instructed Claude Code with "please read X and do Y". How are skills different from that?
They're not. They are just a formalization of that pattern, with a very tiny extra feature where the model harness scans that folder on startup and loads some YAML metadata into the system prompt so it knows which ones to read later on.
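For anyone who wants to see how little is involved, here is a rough sketch of that startup scan. It is not Claude Code's actual implementation: the function names and the prompt wording are made up for illustration, and the front matter parser only handles flat `key: value` lines rather than full YAML. It assumes each skill lives in its own folder with a SKILL.md that starts with a `---` fenced block containing `name` and `description`.

```python
from pathlib import Path


def read_front_matter(path: Path) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading `---` fences.

    Simplification: this is not a real YAML parser, just enough for
    one-line string fields.
    """
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing fence: treat the file as having no front matter


def build_skill_index(skills_dir: Path) -> str:
    """Build the short text a harness could append to the system prompt."""
    entries = []
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        meta = read_front_matter(skill_file)
        # Skip files that lack the two fields this sketch relies on.
        if "name" in meta and "description" in meta:
            entries.append(
                f"- {meta['name']}: {meta['description']} "
                f"(read {skill_file.as_posix()} before using)"
            )
    return "Available skills:\n" + "\n".join(entries)
```

Only the one-line descriptions go into the prompt up front; the body of each SKILL.md stays on disk until the model decides to read it.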
So "skills" are a hack around the LLM not actually being very smart? Interesting.
Everything we do with LLMs is a hack around them not actually being very smart!
Working around their many limitations has been the nature of the game since the original GPT-3.
It's more that they are embracing that the LLM is smart enough that you don't need to build in this functionality beyond that very minimal part.
A fun thing: Claude Code will sometimes fail to find the skill the "proper" way, and will then in fact sometimes look for the SKILL.md file with tools, and read the file with tools, showing that it's perfectly capable of doing all the steps.
You could probably "fake" skills pretty well with instructions in CLAUDE.md to use a suitable command to extract the preamble of files in a given directory, and tell it to use that to decide when to read the rest.
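As a sketch of what such a "suitable command" might look like, something along these lines would do. The function names are hypothetical, and it assumes the preamble is a `---` fenced block at the top of each Markdown file. It prints only the preambles, so the model can decide which file to read in full.

```python
from pathlib import Path


def extract_preamble(path: Path) -> str:
    """Return the text between the leading `---` fences, or "" if absent."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return ""
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:i])
    return ""  # unterminated preamble: ignore it


def list_preambles(directory: Path) -> str:
    """Collect every Markdown preamble under `directory`, labeled by path."""
    blocks = []
    for path in sorted(directory.rglob("*.md")):
        preamble = extract_preamble(path)
        if preamble:
            blocks.append(f"# {path.as_posix()}\n{preamble}")
    return "\n\n".join(blocks)
```

Wrapped in a small script that prints `list_preambles(Path(some_dir))`, this could be referenced from CLAUDE.md with an instruction to run it first and then read whichever file matches the task.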
It's the fact that it's such a thin layer that is exciting - it means we need increasingly less special logic other than relying on just basic instructions to the model itself.
No, skills are a set of manifested and tested 'skills' which reduce the 'mental load' on the LLM and the context it needs to do things reproducibly.
Similar to what humans do.
It's more about not wasting context having it figure things out.
It’s documentation vs researching how to do something.
The difference is that the code in the directory (and the markdown) are hardcoded and known to work beforehand.
But we are still reliant on the LLM correctly interpreting the choice to pick the right skill. So "known to work" should be understood in the very limited context of "this sub-function will do what it was designed to do reliably" rather than "if the user asks to use this sub-function it will do what it was designed to do reliably".
Skills feel like a non-feature to me. It feels more valuable to connect a user to the actual tool and let them familiarize themselves with it (and not need the LLM to find it in the future) rather than having the tool embedded in the LLM platform. I will carve out a very big exception of accessibility here - I love my home device being an egg timer - it's a wonderful egg timer (when it doesn't randomly play music) and I could buy an egg timer but having a hands-free egg timer is actually quite valuable to me while cooking. So I believe there is real value in making these features accessible through the LLM over media that the feature would normally be difficult to use in.
This is no different to an MCP, where you rely on the model to use the metadata provided to pick the right tool, and understand how to use it.
Like with MCP, you can provide a deterministic, known-good piece of code to carry out the operation once the LLM decides to use it.
But a skill can evolve from pure Markdown via inlining some shell commands, up to a large application. And if you let it, with Skills the LLM can also inspect the tool, and modify it if it will help you.
All the Skills I use now have evolved bit by bit as I've run into new use-cases and told Claude Code to update the script the skill references or the SKILL.md itself. I can evolve the tooling while I'm using it.
Choice to pick the right tool -- there is a benchmark which tracks the accuracy of this.
"Known to work" -- if it has hardcoded code, it will work 100% of the time - that's the point of Skills. If it's just markdown then yes, some sort of probability will be there and it will keep on improving.
Not really special, just officially supported, and I'm guessing how best to use it is baked in via RL. Claude already knows how skills work, versus having to learn your own home-rolled solution.