Comment by kansface

2 days ago

> I just can't figure out _how_ to burn that much money a month responsibly.

I always have a few agents (2-5) doing research and working on plans in parallel. A plan is a thorough and unambiguous document describing the process to implement some feature. It contains goals, non-goals, data models, access patterns, explicit semantics, migrations, phasing, requirements, and acceptance criteria, both phased and final. Plans often require speculative work to formulate. Plans take anywhere from hours to a couple of weeks to write. Humans may review the plans or derived RFCs. Chiefly, AI reviews the code (multiple agents with differing prompts until a fixed point is reached between them). Tests and formal methods are meant to do the heavy lifting.
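
A rough sketch of what "until a fixed point is reached" could look like, not the commenter's actual setup. The reviewers here are hypothetical stand-in functions; in practice each would presumably wrap an agent call with its own prompt. The loop stops when a full round of reviewers leaves the code unchanged.

```python
from typing import Callable, List

# A reviewer takes the current code and returns a (possibly unchanged) revision.
# Assumption: real reviewers would be agent calls; these are plain functions
# so the sketch runs on its own.
Reviewer = Callable[[str], str]


def review_until_fixed_point(code: str, reviewers: List[Reviewer], max_rounds: int = 10) -> str:
    """Apply every reviewer in turn until a full round changes nothing."""
    for _ in range(max_rounds):
        before = code
        for review in reviewers:
            code = review(code)
        if code == before:  # no reviewer changed anything: fixed point
            return code
    raise RuntimeError("reviewers did not converge within max_rounds")


# Hypothetical stand-in reviewers.
def strip_trailing_whitespace(code: str) -> str:
    return "\n".join(line.rstrip() for line in code.split("\n"))


def replace_tabs(code: str) -> str:
    return code.replace("\t", "    ")


result = review_until_fixed_point(
    "def f():\t\n\treturn 1  \n",
    [strip_trailing_whitespace, replace_tabs],
)
```

The `max_rounds` cap is a design choice for the sketch: reviewers with conflicting preferences can oscillate forever, so some bound on rounds seems prudent.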

In my highest volume weeks, I ship low hundreds of thousands of lines of software not counting changes to deps.

> At a corporate level, I'd much rather hire a junior engineer

Any formulation of a problem sufficient for a truly junior engineer to execute is better given to an agent. The solution is cheaper, faster, and likely better. If the latter doesn't hold, 10 independent solutions are still cheaper and faster than a junior engineer.

There is no longer any likely path to teaching a junior engineer the trade.

Just out of curiosity, what type of systems are you working on? What type of features did you implement on your 100k LOC week?

  • I don't know about the GP. My workflow is similar to theirs, but I aim to ship low thousands of lines per week. The fewer the better. I even tell the agent to only write high SNR tests, otherwise it just adds useless "make sure this function returns this thing we hardcoded" tests.

    I usually succeed, BTW. I spend a lot of time planning, but usually each PR is a few hundred lines, and fairly easily reviewable.

    I mostly work with Python backends, though these days it might be any language (Ruby, Go, TS).
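
    To illustrate the distinction between low and high SNR tests, here is a minimal sketch. Both functions under test (`default_timeout`, `slugify`) are hypothetical examples, not anything from the commenter's codebase. The first test only mirrors a hardcoded constant; the second checks properties that must hold across varied inputs.

```python
def default_timeout() -> int:
    """Hypothetical function that returns a hardcoded constant."""
    return 30


def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase, hyphen-separated words."""
    words = "".join(c if c.isalnum() else " " for c in title.lower()).split()
    return "-".join(words)


def test_low_signal():
    # Restates the constant; it can only fail if someone edits the number.
    assert default_timeout() == 30


def test_high_signal():
    # Checks properties that should hold for every input in a varied sample.
    samples = ["Hello World", "  spaces   everywhere ", "Punctuation!? Yes.", "MiXeD-case_Input", ""]
    for title in samples:
        slug = slugify(title)
        assert slug == slug.lower()
        assert all(c.isalnum() or c == "-" for c in slug)
        assert "--" not in slug
        assert not slug.startswith("-") and not slug.endswith("-")
        assert slugify(slug) == slug  # slugifying a slug changes nothing
```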

  • > What type of features did you implement on your 100k LOC week?

    I work on 3rd party API integrations, of which we have hundreds, each in its own repo. We need to build thousands more at a fraction of the cost. Any given integration historically takes a human a few days up to a few months to build and is subject to ongoing maintenance. We frequently do not have access to the API, and even when we do, we almost never have a representative data set. Complex APIs tend to expose multiple, entwined data models. Documentation may be wrong or in a foreign language.

    I've been building a new framework to do it better. Ideally, we can get an agent to spit them out in a few minutes to hours with a much reduced ops burden for managing the fleet, all with very high confidence. The latter requires pushing as much into the type system as possible and leveraging static analysis. Much of the work has been embarrassingly parallelizable. Consider categorizing access patterns across the entire set or ensuring byte for byte parity (over the input space of third party API responses).

    This is absolutely not a problem that a human or 2 could tackle prior to AI.
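
    One way a byte-for-byte parity check could be sketched, assuming the parity in question is between an existing integration and its replacement (the comment does not say). All names below are hypothetical. Each recorded response is checked independently, which is what makes this kind of work easy to fan out.

```python
from typing import Callable, Iterable, List, Tuple

# Assumption: an integration step can be modelled as bytes in, bytes out.
Transform = Callable[[bytes], bytes]


def parity_failures(
    legacy: Transform, candidate: Transform, responses: Iterable[bytes]
) -> List[Tuple[bytes, bytes, bytes]]:
    """Return (input, legacy_output, candidate_output) for every byte-level mismatch."""
    failures = []
    for raw in responses:
        expected, actual = legacy(raw), candidate(raw)
        if expected != actual:
            failures.append((raw, expected, actual))
    return failures


# Hypothetical stand-ins for an old and a new implementation.
def legacy_normalize(raw: bytes) -> bytes:
    return raw.strip().lower()


def new_normalize(raw: bytes) -> bytes:
    return raw.lower().strip()


def buggy_normalize(raw: bytes) -> bytes:
    return raw.lower()  # forgets to strip surrounding whitespace


# Hypothetical recorded API responses.
samples = [b"  OK ", b"Error\n", b"done"]
clean = parity_failures(legacy_normalize, new_normalize, samples)
broken = parity_failures(legacy_normalize, buggy_normalize, samples)
```

    Returning the mismatching triples rather than a boolean keeps the failing inputs around for whoever (or whatever) has to fix the candidate.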

    • > I've been building a new framework to do it better.

      So you're using your software factory to build a software factory. Not building thousands of integrations at a fraction of the cost.

I am sorry, I am probably just very dumb, but this sounds extremely wasteful. If this is a reflection of how software was made before AI I wonder how anything was ever made.

> In my highest volume weeks, I ship low hundreds of thousands of lines of software not counting changes to deps.

I'm suspicious that you actually get Claude to output that much usable code in a week, but maybe you do.

But I’m 100% positive that you’re not shipping even a small fraction of the amount of value that someone reading this 2 years ago would have expected from hundreds of thousands of lines of code.

I dunno, I've seen agents make boneheaded mistakes even a junior engineer wouldn't make. Treating them as strictly better than junior engineers is a problem, not just for that reason but because you're effectively killing the pipeline for senior engineers. Then what?

  • > I dunno I've seen agents make boneheaded mistakes even a junior engineer wouldn't make.

    Yes, of course.

    > you're effectively killing the pipeline for senior engineers. Then what?

    I honestly don't know _what_. It's a prisoner's dilemma.

You will burn yourself out in months at that level of daily context switching.

It isn't worth it.

> In my highest volume weeks, I ship low hundreds of thousands of lines of software not counting changes to deps

But what do they actually do?

I keep seeing people wax poetic about the mountains and mountains of code that LLMs are dumping out, but I'm yet to see anywhere near a proportionate amount of actually useful new apps or features. And if anything, the useful ones I do find are just more shovels for more AI. When do we get to the part where we start seeing the 10x gains from the billions of lines of code that have probably been generated at this point?