Comment by exitb

1 day ago

I’m seeing this more and more, where people build this artificial wall you supposedly need to climb before trying agentic coding. That’s not the right way to start at all. You should start with a fresh .claude, an empty AGENTS.md, zero skills and MCPs, and learn to operate the thing first.

I'd also go even further and say that you likely should never install ANY skill that you didn't create yourself (I mean, guiding Claude to create it for you works too) or "fork" from an existing one, pulling in only what you need.

Everyone's workflow is different and nobody knows which workflow is the right one. If you turn your harness into a junk drawer of random skills that get auto-updated, you introduce yet another layer of nondeterminism, and also blow up your context window.

The only skill you should probably install rather than maintain yourself is playwright-cli, but that's pretty much it.

  • > I'd also go even further and say that you likely should never install ANY skill that you didn't create yourself

    Ignore the original comment below; the post is technical and so is the parent comment: it's for techies.

    ---

    That applies to tech users only.

    Non-tech users are starting to use Claude Code and won't care about anything beyond getting the job done.

    Claude introduced skills to bring more non-tech users to the CLI, as a good way to get your feet wet.

    Not everyone will go for such minute tweaks.

    • What? Non-techies are the most at risk. There are a huge number of malicious skills. Not knowing or caring how to spot malicious behavior doesn’t mean someone shouldn’t be concerned about it, no matter how little they can or want to do about it.

      I am an administrator of this stuff at my company and it’s an absolute effing nightmare devising policies that protect people from themselves. If I heard this come out of the mouth of someone underneath me, I’d tell them to leave the room before I have a stroke.

      And this is stuff where, if so-and-so’s machine is compromised, it could cost the company massive sums of money. For your personal use, fine, but hearing this cavalier attitude, like it doesn’t matter, is horrifying, because it absolutely does in a lot of contexts.

      3 replies →

  • I had an issue with playwright MCP where only one Claude Code instance could be using it at a time, so I switched to Claude's built-in /chrome MCP.

    In practice, I also find it more useful that the Chrome MCP uses my current profile since I might want Claude to look at some page I'm already logged in to.

    I'm not very sophisticated here though. I mainly use the browser MCP to get around the fact that some 30% of servers, like Apple's documentation, block agent traffic.

    • Would love it if there were a way to parallelize the Playwright MCP across multiple agents, but it seems to be a fundamental limitation of that MCP that only one instance/tab can be controlled.

      Chrome MCP is much slower and by default pretty much unusable because Claude seems to prefer to read state from screenshots. Also, no Firefox/Safari support means no cross-browser testing.

      There appears to be https://github.com/sumyapp/playwright-parallel-mcp which may be worth trying.

    • I was using the built-in Chrome skill but it was too unreliable for me. So I switched to the Playwright CLI, and I can also have it use Firefox to help debug browser-specific issues.

  • Yes, this is the path I’m taking. Experiment and build your own toolbox, whether it’s hand-rolled skills or particular skills you pull out of other public repos. Then maintain your own set.

    You do not want to log in one day to find your favorite workflow has changed via updates.

    Then again this is all personal preference as well.

  • I use vanilla Claude Code, and I've never looked that much into skills, so I'm curious: how do you know when it's time to add a new skill?

    • I use them for repeated problems or workflows I encounter when running with the defaults. If I find myself needing to repeat myself about a certain thing a lot, I put it into CLAUDE.md. When that gets too big, or I want detailed, token-heavy instructions that are only occasionally needed, I create a skill.

      I also import skills or groups of skills like Superpowers (https://github.com/obra/superpowers) when I want to try out someone else's approach to claude code for a while.

    • You observe what it does to accomplish a particular task, and note any instances where it:

      1. Had to consume context and turns by reading files, searching the web, or running several commands for what was otherwise a straightforward task

      2. Used a tool that wasn't designed with agent usage in mind, which most of the time means the agent has to run tail, head, or grep on the output by re-running the same command

      Then you create a skill that teaches how to do this in fewer turns, possibly even adding custom scripts it can use as part of that skill.

      You almost never need a skill per se; most models will figure things out themselves eventually. A skill is usually just an optimization technique.

      Apart from this, you can also use it to teach your own protocols and conventions. For example, I have skills that teach Claude, Codex, and Gemini how to communicate between themselves using tmux with some helper scripts. And then another skill that tells it to do a code review using two models from two providers, synthesize findings from both, and flag anything that both reported.

      That said, I have abandoned the built-in skill system completely, instead using my own tmux wrapper that injects skills via predefined triggers, but this is stepping into more advanced territory. The built-in skill system will serve you well initially, and since skills are nothing but Markdown files plus maybe some scripts, you can easily migrate them into whatever you want later.
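
      For a concrete picture of how small these are: a skill is just a directory under .claude/skills/ holding a SKILL.md with a short frontmatter block. A minimal, made-up sketch (the skill name and the steps are hypothetical, not a real skill):

      ```markdown
      ---
      name: changelog-entry
      description: Use when asked to record a change in CHANGELOG.md.
      ---

      # Adding a changelog entry

      1. Read the top entry of CHANGELOG.md to match the current format.
      2. Prepend a new one-line entry in that format.
      3. Link the PR if one exists; otherwise reference the commit.
      ```

      The name and description are what the model sees when deciding whether to load the skill, so a concrete "use when..." description tends to matter more than the body.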

This matters for big engineering teams who want to put _some_ kind of guardrails around Claude that they can scale out.

For example, I have a rule [^0] that instructs Claude to never start work until some pre-conditions are met. This works well, as it always seems to check these conditions before doing anything, every turn.

I can see security teams wanting to use this approach to feel more comfortable about devs doing things with agentic tools without worrying _as much_ about them wreaking havoc (or what they consider "havoc").

As well, as someone who's just _really_ getting started with agentic dev, spending time dumping how I work into rules has helped Claude not do things I disapprove of, like signing off commits with my GPG key.

That said, these rules will never be set in stone, at least not at first.
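
As an illustration only (this is a made-up sketch, not the linked rule), such a pre-condition gate can be a short Markdown block in a rules file:

```markdown
# Pre-flight checks (every turn, before ANY work)

1. We are on a feature branch, never `main`.
2. `git status` shows a clean working tree.
3. The task has an agreed plan or issue reference.

If any check fails, stop and ask before proceeding.
```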

[^0]: https://github.com/carlosonunez/bash-dotfiles/blob/main/ai/c...

  • I'm also thinking about how we can put guardrails on Claude, but more around context changes. For example, if you go and change AGENTS.md, that affects every dev in the repo. How do we make sure that the change they made is actually beneficial? And, thinking further, how do we check that it works with every tool/model used by devs in the repo? Does the change stay stable over time?

    • Given the scope that AGENTS has, I would use PRs to test those changes and discuss them like any other large-impact area of the codebase (like configs).

      If you wanted to be more “corporate” about it, then assuming that devs are using some enterprise wrapper around Claude or whatever, I would bake an instruction into the system prompt that ensures that AGENTS is only read from the main branch to force this convention.

      This is harder to guarantee since these tools are non-deterministic.

  • NO EXCEPTIONS!!!!!!!!!!!!!!!!!!!!!!!!

    cute that you think Claude gives a rat's ass about this.

This article isn't saying you must set up a big .claude folder before you start. It repeats several times that it's important to start small and keep it short.

It's also not targeted at first-timers getting their first taste of AI coding. It's a guide for how to use these tools to deal with frustrations you will inevitably encounter with AI coding.

Though really, many of the complaints about AI coding on HN are written by beginners who would also benefit from a simple .claude configuration that includes their preferences and some guidelines. A frequent complaint from people who do drive-by tests of AI coding tools before giving up is that the tools aren't reading their mind or the tools keep doing things the user doesn't want. Putting a couple lines into AGENTS.md or the .claude folder can fix many of those problems quickly.
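
And a "couple of lines" really can be the whole file; a hypothetical beginner's AGENTS.md might be just:

```markdown
- Ask before adding any new dependency.
- Make minimal, targeted edits; never rewrite a file wholesale.
- Run the test suite after each change and fix failures before moving on.
```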

Yes, but as soon as you start checking in and sharing access to a project with other developers these things become shared.

Working out how to work on code on your own with agentic support is one thing. Working out how to work on it as a team where each developer is employing agentic tools is a whole different ballgame.

  • But why is it different? Why does it need to be? I don't write code the same as other devs so why would/should I use AI the same?

    Is this a hangover from when the tools were not as good?

    • I'd see this as being useful for two reasons:

      1. Provision of optional tools: I may use an AI agent differently from all the other devs on a team, but it seems useful for me to have access to the same set of project-specific commands, skills, and MCP configs that my colleagues do. I'm not forced to use them, but I can choose to on a case-by-case basis.

      2. Guardrails: it seems sensible to define a small subset of things you want to dissuade everyone's agents from doing to your code. This is like the agentic extension of coding standards.

    • > I don't write code the same as other devs

      Most people do, most people don’t have wildly different setups do they? I’d bet there’s a lot in common between how you write code and how your coworkers do.

      1 reply →

  • In my own group, agentic coding made sharing and collaboration go out the window because Claude will happily duplicate a bunch of code in a custom framework

    • Almost every one of my AGENTS.md files has these two lines:

      - Under no condition should you use emojis.

      - Before adding a new function, method, or class, scan the project codebase and attached frameworks to verify that something existing cannot be modified to fit the need.

      1 reply →

    • I think the idea is that by creating these shared .claude files, you tell the agent how to develop for everyone and set shared standards for design patterns/architecture so that each user's agents aren't doing different things or duplicating effort.

Seriously, just use plan mode first and you get like 90% of the way there, with CC launching subagents that will generally do the right thing anyway.

IMHO most of this “customize your config to be more productive” stuff will go away within a year, obsoleted by improved models and harnesses.

Just like how all the lessons for how to use LLMs in code from 1-2 years ago are already long forgotten.

  • I loved all the dumb prompt “hacks” back then like “try saying please”

    • Modern "skills" and Markdown formats of the day are no different than "save the kittens". All of these practices are promoted by influencers and adopted based on wishful thinking and anecdata.

      4 replies →

Two months ago I built (with Claude) a fairly advanced Python CLI script and Claude skill that searches and filters the Claude logs, to access information from other sessions or from the same session before context compaction. But today Claude Code has a built-in feature to search its logs and will readily do so when needed.

My point is, these custom things are often short lived band-aids, and may not be needed with better default harnesses or smarter future models.
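
For a sense of what such a band-aid looks like, the core of a log-search helper fits in a few lines of Python. This is a hypothetical sketch, not the script described above; the JSONL layout and the `message` field are assumptions about how Claude Code stores session logs:

```python
import json
from pathlib import Path

def search_logs(log_dir, query):
    """Scan JSONL session logs under log_dir for entries mentioning query."""
    hits = []
    for path in sorted(Path(log_dir).glob("**/*.jsonl")):
        for line in path.read_text().splitlines():
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip truncated or corrupt lines
            # Serialize the message so nested tool-call content is searched too
            if query.lower() in json.dumps(entry.get("message", "")).lower():
                hits.append({"file": path.name, "message": entry["message"]})
    return hits
```

A real version would also filter by session, date range, and role, which is roughly the point at which a built-in log search makes the custom tool redundant.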

  • This is very insightful, thanks for sharing.

    I’ve been developing and working on dev tools for more than 15 years. I’ve never seen things evolve so rapidly.

    Experiment, have fun and get things done, but don’t get too sure or attached to your patches.

    It’s very likely the models and harnesses will keep improving around the gaps you see.

    I’ve seen most of my AGENTS.md directives and custom tools fade away too, as the agents get better and better at reading the code and running the tests and feeding back on themselves.

I totally agree with you that this is not the right way to start. But, in my experience, the more you use the tool, the more of a "feel" you get for it, and knowing how all these different pieces work and line up can be quite useful (though certainly not mandatory). It's been immensely frustrating to me how difficult it is to find all this info amid all the low-quality junk that is out there on the internet.

  • > all the low-quality junk that is out there on the internet.

    Isn't this article just another one in that same drawer?

    > What actually belongs in CLAUDE.md - Write: - Import conventions, naming patterns, error handling styles

    Then just a few lines below:

    > Don’t write: - Anything that belongs in a linter or formatter config

    The article overall seems filled with internal inconsistencies, so I'm not sure this article is adding much beyond "This is what an LLM generated after I put the article title with some edits".

.claude has become the new dotfiles. And what do people do when they want to start using dotfiles? They copy others' dotfiles, and the same is happening here :)

  • .claude is likely to contain secrets, and it also contains garbage like caches; if it is shared, it should only be partially shared.

I agree with most of this, with one important exception: you should have some form of sandboxing in place before running any local AI agent. The easiest way to do that is with .claude/settings.json[0].

This is important no matter how experienced you are, but it's arguably most important when you don't know what you're doing.

0: or if you don't want to learn about that, you can use Claude Code Web
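
For reference, that file is a `permissions` block with allow/deny lists of tool-use patterns; a rough sketch (the specific patterns here are illustrative, not a complete policy):

```json
{
  "permissions": {
    "allow": [
      "Bash(npm run test:*)",
      "Bash(git diff:*)"
    ],
    "deny": [
      "Read(./.env)",
      "Read(./secrets/**)",
      "Bash(curl:*)"
    ]
  }
}
```

Deny rules match command patterns, not intent, so this is best treated as one layer on top of OS-level sandboxing rather than a substitute for it.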

  • The part about permissions with settings.json [0] is laughable. Are we really supposed to list all potential variations of harmful commands? In addition to `Bash(cat ./.env)`, we would also need to add `Bash(cat .env)`, `Bash(tail ./.env)`, `Bash(tail .env)`, `Bash(head ./.env)`, `Bash(sed '' ./.env)`, and countless others... while at the same time we allow something like `npm` to run?

    I know the deny list is only for automatic denial, and that any non-explicitly-allowed command will pause, waiting for the user to confirm. But it still reminds me of the rationale the author of the Pi harness [1] gave to explain why there will be no built-in permission feature in Pi (emphasis mine):

    > If you look at the security measures in other coding agents, *they're mostly security theater*. As soon as your agent can write code and run code, it's pretty much game over. [...] If you're uncomfortable with full access, run pi inside a container or use a different tool if you need (faux) guardrails.

    As you mentioned, this is a big feature of Claude Code Web (or Codex/Antigravity or whatever equivalent of other companies): they handle the sand-boxing.

    [0] https://blog.dailydoseofds.com/i/191853914/settingsjson-perm...

    [1] https://mariozechner.at/posts/2025-11-30-pi-coding-agent/#to...

  • Do people really run Claude and other CLIs like this outside a container??

    • Yes. I don't bother with that. I feel like the risk of Claude Code running amok is pretty low, and I don't have it do long-running tasks that exceeds my desire to monitor it. (Not because I'm worried about it breaking things, it's just I don't use the tool in that way.)

    • Let's not fool ourselves here. If a security feature adds any amount of friction at all, and there's a simple way to disable it, users will choose to do so.

    • I'm sure most folks run Claude without isolation or sandboxing. It's a terrible idea, but even most professional software developers don't think much about security.

      There are many decent options (cloud VMs, local VMs, Docker, the built-in sandboxing). My point is just that folks should research and set up at least one of them before running an agent.

    • How did you contain Claude Code? Did you virtualize it? I just set up a simple firejail script for it. Not completely sure if it's enough but it's at least something.

      2 replies →

this is true, but i think people are best off starting with SOME project that gives users an idea of how to organize and think about stuff. for me, this is gastown, and i now have what has gotta be the most custom gastown install out there. could not agree more that your ai experience must be that which you build for yourself, not a productized version that purports to magically agentize your life. i think this is the real genius of gastown— not how it works, but that it does work and yegge built it from his own mind. so i’ve taken the same lesson and run very, very far with it, while also going in a totally different direction in many ways. but it is a work of genius, and i respect the hell out of him for putting it out there.

It's not as idyllic as this when trying to get an org on board. We're currently very open to using Claude, but the unknowns are still the unknowns, so the guardrails the `.claude` folder provides give us comfort as we gain familiarity with the tool.

Who is building an artificial wall? Maybe I skimmed the post too fast, but it doesn't seem like this information is being presented as "you have to know/do this before you start agentic engineering", just "this is some stuff to know."

Peter Steinberger himself says he's just chatting with AI instead of coming up with crazy coding workflows.

with Anthropic already starting to sell "Claude Certified Architect" exams and a "Partner Network Program", I think a lot of this stuff is about building a side industry on top of it, unfortunately

> empty AGENTS.md, zero skills

which is basically every setup, because Claude sucks at calling skills and forgets everything in CLAUDE.md within a few seconds.

  • Right? I laughed when I read this:

    >If you tell Claude to always write tests before implementation, it will. If you say “never use console.log for error handling, always use the custom logger module,” it will respect that every time.

    It just isn't true lol

  • Yep, it regularly ignores CLAUDE.md files. It seems these files are not weighted highly enough vs. the prompt.