They didn't write 100k plan lines. The LLM did (at least 99.9% of it). Writing 30k by hand would take weeks if not months. LLMs do it in an afternoon.
Just reading that plan would take weeks or months
You don't start with 100k lines, you work in batches that are digestible. You read it once, then move on. The lines add up pretty quickly considering how fast Claude works. If you think about the difference in how many characters it takes to describe what code is doing in English, it's pretty reasonable.
And my weeks or months of work beat an LLM's 10/10 times. There are no shortcuts in life.
I have no doubt that they do for many people. But the time/cost tradeoff is still unquestionable. I know I could build what LLMs do for me in the frontend/backend, in most cases as well or better. I know that because I've done it at work for years. But creating a somewhat complex app with lots of pages/features/APIs etc. would take me months if not a year++, since I'd only be working on it on weekends for a few hours. Claude Code helps me out by getting me to my goal in a fraction of the time. Its superpower lies not only in doing what I know, only faster, but also in doing what I don't know.
I reap similar benefits at work. I can wow management with LLM-assisted/vibe-coded apps. What previously would've taken a multi-man team weeks of planning and executing, stand-ups, jour fixes, architecture diagrams, etc. can now be done within a single week by myself. For the type of work I do, managers do not care whether I could do it better if I coded it myself. They are amazed, however, that what previously took months can be done in hours nowadays. And I for sure will try to reap the benefits of LLMs for as long as they don't replace me, rather than being idealistic and fighting against them.
Might be true for you. But there are plenty of top tier engineers who love LLMs. So it works for some. Not for others.
And of course there are shortcuts in life. Any form of progress, whether it's cars, medicine, computers or the internet, is a shortcut in life. They make life easier for a lot of people.
That's not (or should not be) what's happening.
They write a short high-level plan (let's say 200 words). The plan asks the agent to write a more detailed implementation plan (written by the LLM, let's say 2000-5000 words).
They read this plan and adjust as needed, even sending it to the agent for re-dos.
Once the implementation plan is done, they ask the agent to write the actual code changes.
Then they review that and ask for fixes, adjustments, etc.
This can take about as long as writing the code yourself, but it also leaves a detailed trail of what was done and why, which I basically NEVER see in human-written code.
That alone is worth gold.
And on top of that, if you're using an unknown platform or stack, it's basically a rocket ship. You bootstrap much faster. Of course, stay on top of the architecture, do controlled changes, learn about the platform as you go, etc.
I take this concept and I meta-prompt it even more.
I have a road map (AI generated, of course) for a side project I'm toying around with to experiment with LLM-driven development. I read the road map, and I understand and approve it. Then, using some skills I found on skills.sh and slightly modified, my workflow is as follows:
1. Brainstorm the next slice
It suggests a few items from the road map that should be worked on, with some high level methodology to implement. It asks me what the scope ought to be and what invariants ought to be considered. I ask it what tradeoffs could be, why, and what it recommends, given the product constraints. I approve a given slice of work.
NB: this is the part I learn the most from. I ask it why X process would be better than Y process given the constraints and it either corrects itself or it explains why. "Why use an outbox pattern? What other patterns could we use and why aren't they the right fit?"
2. Generate slice
After I approve what to work on next, it generates a high-level overview of the slice, including files touched, saved to a Markdown file that is persisted. I read through the slice, make sure it is working on what I expect it to be working on and that it's neither scope creeping nor undercutting the scope, and I approve it. It then makes a plan based on this.
3. Generate plan
It writes a rather lengthy plan, with discrete task bullets at the top. Beneath, each step has to-dos for the LLM to follow, such as generating tests, running migrations, etc., with commit messages for each step. I glance through this for any potential red flags.
4. Execute
This part is self explanatory. It reads the plan and does its thing.
I've been extremely happy with this workflow. I'll probably write a blog post about it at some point.
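For anyone who thinks better in code, here is a minimal sketch of that gated flow. It is only an illustration, not the actual skills setup described above: `run_slice`, `generate`, `approve` and `max_redos` are hypothetical names, `generate` stands in for whatever call hands a stage to the agent, and `approve` stands in for the human reading the output. The first three steps wait for approval; the execute step just runs off the approved plan.

```python
from dataclasses import dataclass, field
from typing import Callable

# (stage name, needs human approval). Mirrors steps 1-4 above;
# the names themselves are assumptions made for this sketch.
STAGES = [
    ("brainstorm", True),
    ("slice", True),
    ("plan", True),
    ("execute", False),
]


@dataclass
class SliceRun:
    """Artifacts produced for one slice of work, keyed by stage name."""
    name: str
    artifacts: dict[str, str] = field(default_factory=dict)


def run_slice(
    name: str,
    generate: Callable[[str, dict[str, str]], str],
    approve: Callable[[str, str], bool],
    max_redos: int = 3,
) -> SliceRun:
    """Run the stages in order, redoing a gated stage until it is approved.

    Raises RuntimeError if a gated stage is rejected max_redos times.
    """
    run = SliceRun(name)
    for stage, gated in STAGES:
        for _ in range(max_redos):
            draft = generate(stage, run.artifacts)
            if not gated or approve(stage, draft):
                run.artifacts[stage] = draft
                break
        else:
            # The inner loop never hit `break`, so nothing was approved.
            raise RuntimeError(f"{stage!r} not approved after {max_redos} attempts")
    return run


if __name__ == "__main__":
    # Stand-ins for the agent and the reviewer, for demonstration only.
    fake_agent = lambda stage, prior: f"{stage} output ({len(prior)} prior artifacts)"
    approve_all = lambda stage, text: True
    print(list(run_slice("demo-slice", fake_agent, approve_all).artifacts))
```

The point of the structure is that each stage only ever sees artifacts a human has already signed off on, which is what keeps the later, longer documents skimmable.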
If you want to have some fun, experiment with this: add a step (maybe between 3 and 4):
3.5 Prove
Have the LLM demonstrate, through our current documentation and other sources of facts, that the planned action WILL work correctly, without failure. Ask it to enumerate all risks and point out how the plan mitigates each risk. On several occasions I've seen the LLM backtrack at this step and come up with clever, previously unforeseen error cases.
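One way to make that step checkable rather than purely rhetorical, sketched under assumptions: have the LLM emit its risk list in a structured form, then verify mechanically that every risk points at a step that actually exists in the plan. The `Risk` type, the `unmitigated` helper and the step ids below are all hypothetical, invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    description: str
    mitigated_by: tuple[str, ...]  # ids of plan steps claimed to address the risk


def unmitigated(risks: list[Risk], plan_step_ids: set[str]) -> list[Risk]:
    """Return risks that cite no step, or cite any step missing from the plan."""
    return [
        r for r in risks
        if not r.mitigated_by or not set(r.mitigated_by) <= plan_step_ids
    ]


if __name__ == "__main__":
    # Made-up example data; a real list would come from the LLM's output.
    plan_steps = {"step-1", "step-2", "step-3"}
    risks = [
        Risk("migration fails halfway", ("step-2",)),
        Risk("duplicate events on retry", ()),
        Risk("stale cache after deploy", ("step-9",)),
    ]
    for risk in unmitigated(risks, plan_steps):
        print("needs attention:", risk.description)
```

This only catches risks with a missing or dangling mitigation; whether a cited step really addresses the risk still needs a human read.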
This is a super helpful and productive comment. I look forward to a blog post describing your process in more detail.
Yep, with a human in the loop to process these larger, sprawling plan docs (iteratively inflated with the designer's intent).
Some get deleted from the repo, others archived, others merged or referenced elsewhere. It's kind of organic.
> Haven't seen a single useful thing produced by this garbage process you describe
By using it first-hand or through a colleague? And useful to whom: you, or the person writing it? There are plenty of people in this thread who have actually used this "garbage process," myself included, to produce stuff we, and our colleagues, find useful.
What genuinely new thing have you produced?