
Comment by lanthissa

1 day ago

It's not, though, if you're working in a massive codebase or on a distributed system with many interconnected parts.

Skills that teach the agent how to pipe data, build requests, trace them through a system and its data sources, then update code based on those results are a step-function improvement in development.

AI has fundamentally changed how productive I am working on a 10M-line codebase, and I'd guess less than 5% of that is due to code gen that's intended to go to prod. Nearly all of it is the ability to rapidly build tools and toolchains to test and verify what I'm doing.

But... plain Claude does that. At least for my codebase, which is nowhere near your 10M lines. But we do process lots of data (~100 TB), and Claude definitely builds one-off tools and scripts to analyze it, which works pretty well in my experience.

What sort of skills are you referring to?

  • I think people are looking at skills the wrong way. It's not like they give it some kind of superpowers it couldn't have otherwise. Ideally you'll have Claude write the skills anyway. A skill is just a shortcut so you don't have to keep rewriting a prompt over and over, and/or have Claude keep figuring out how to do the same thing repeatedly. You can save lots of time, tokens, and manual guidance with well-thought-out skills. Some people use them to "larp" different job roles etc., and I don't think that's a productive use of skills unless the prompts are truly exceptional.

    • At work I use skills to maintain code consistency. We instrumented a solid "model view viewmodel" architecture for a front-end app, because without any guardrails the LLM was doing redundant data fetching and type casts and was just messy overall. Having an "mvvm" rule and skill that defines the boundaries keeps the LLM from writing a bunch of nonsense code that happens to work.


    • I have sometimes found "LARPing job roles" to be useful for setting expectations for the codebase.

      Claude is kind of decent at doing "when in Rome" sorts of things with your codebase, but it's nice to reinforce that and remind it how to deploy, what testing should be done before a PR, etc.

  • If you build up and save some of those scripts, skills help Claude remember how and when to use them.

    Skills are crazy useful to tell Claude how to debug your particular project, especially when you have a library of useful scripts for doing so.
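
    As a concrete illustration of the point above, here is a minimal sketch of what such a skill might look like, using the SKILL.md layout from Anthropic's Agent Skills feature (YAML frontmatter with a name and a description that tells Claude when to load it, followed by instructions). The service, script names, and paths are all hypothetical stand-ins for your own library of debug scripts:

    ```markdown
    ---
    name: debug-order-service
    description: Debug failed or slow requests in the order service. Use when
      tracing a request across services or verifying what data it touched.
    ---

    # Debugging the order service

    1. Reproduce the failure with the saved request builder (hypothetical script):
       `scripts/build_request.sh --order-id <ID> --env staging`
    2. Trace the request through the system:
       `scripts/trace_request.py --request-id <ID>`
    3. Verify what the request wrote to the datastore:
       `scripts/check_datastore.py --order-id <ID>`
    4. Only update code once the trace and datastore output agree on the root cause.
    ```

    The description is what matters most: Claude uses it to decide when the skill applies, so it should name the situations ("failed or slow requests") rather than just the tool.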

Even the most complex distributed systems can be understood within the context windows we have, short of 1M+ LOC; and even then you could use documentation to get a more succinct view of the whole thing.

  • This really doesn’t pan out in practice if you work a lot with these models.

    And we also know why: effective context depends on input and task complexity. Our best guess right now is that frontier models advertised as 1M-context on needle-in-a-haystack-style benchmarks often have an effective context length of only 100k to 200k tokens.