
Comment by oblio

3 days ago

That's not (or shouldn't be) what's happening.

They write a short high-level plan (say, 200 words). That plan asks the agent to produce a more detailed implementation plan (written by the LLM, say 2,000-5,000 words).

They read this plan and adjust as needed, even sending it back to the agent for re-dos.

Once the implementation plan is done, they ask the agent to write the actual code changes.

Then they review that and ask for fixes, adjustments, etc.

This can be comparable in effort to writing the code yourself, but it also leaves a detailed trail of what was done and why, which I basically NEVER see in human-generated code.

That alone is worth gold.

And on top of that, if you're using an unknown platform or stack, it's basically a rocket ship. You bootstrap much faster. Of course, stay on top of the architecture, make controlled changes, learn about the platform as you go, etc.

I take this concept and I meta-prompt it even more.

I have a roadmap (AI-generated, of course) for a side project I'm toying around with to experiment with LLM-driven development. I read the roadmap, and I understand and approve it. Then, using some skills I found on skills.sh and slightly modified, my workflow is as follows:

1. Brainstorm the next slice

It suggests a few items from the roadmap that should be worked on, with some high-level methodology for implementing them. It asks me what the scope ought to be and what invariants ought to be considered. I ask it what the tradeoffs could be, why, and what it recommends given the product constraints. I approve a given slice of work.

NB: this is the part I learn the most from. I ask it why process X would be better than process Y given the constraints, and it either corrects itself or explains why. "Why use an outbox pattern? What other patterns could we use, and why aren't they the right fit?"
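A minimal sketch of how that tradeoff-probing question might be composed programmatically. The pattern names and constraints here are hypothetical placeholders, not actual project details:

```python
# Hypothetical sketch: composing the step-1 tradeoff question.
# All names (patterns, constraints) are illustrative assumptions.

def tradeoff_prompt(chosen: str, alternatives: list, constraints: str) -> str:
    """Build a question that forces the agent to justify its recommendation."""
    alts = ", ".join(alternatives)
    return (
        f"Why use {chosen}? What other patterns could we use ({alts}) "
        f"and why aren't they the right fit, given these constraints: {constraints}?"
    )

prompt = tradeoff_prompt(
    chosen="an outbox pattern",
    alternatives=["two-phase commit", "change data capture"],
    constraints="single Postgres instance, at-least-once delivery",
)
print(prompt)
```

The point of forcing the alternatives into the question is that the agent has to argue against concrete options rather than just restate its first suggestion.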

2. Generate slice

After I approve what to work on next, it generates a high-level overview of the slice, including the files touched, saved in a Markdown file that is persisted. I read through the slice, ensure that it is indeed working on what I expect it to be working on, and that it's neither scope-creeping nor under-scoping, and I approve it. It then makes a plan based on this.
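Persisting the slice overview might look something like this sketch; the file layout, field names, and paths are assumptions for illustration:

```python
# Hypothetical sketch of step 2: persisting the slice overview as a
# Markdown file so it survives the session and can be reviewed/approved.
from pathlib import Path
import tempfile

def write_slice(slug: str, goal: str, files_touched: list, out_dir: Path) -> Path:
    """Render the slice overview to Markdown and write it to disk."""
    lines = [f"# Slice: {slug}", "", f"**Goal:** {goal}", "", "## Files touched"]
    lines += [f"- `{f}`" for f in files_touched]
    path = out_dir / f"{slug}.md"
    path.write_text("\n".join(lines) + "\n")
    return path

out = write_slice(
    "payments-outbox",
    "Add an outbox table and relay",
    ["db/migrations/0007_outbox.sql", "app/relay.py"],
    Path(tempfile.mkdtemp()),
)
print(out.read_text())
```

Listing the files touched up front is what makes scope creep visible at review time: any file outside the list is a red flag.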

3. Generate plan

It writes a rather lengthy plan, with discrete task bullets at the top. Beneath, each step has to-dos for the LLM to follow, such as generating tests, running migrations, etc., with a commit message for each step. I glance through this for any potential red flags.
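The shape of such a plan file could be sketched like this; the structure (task checklist up top, per-step to-dos, one commit message each) follows the description above, but the concrete fields and tasks are hypothetical:

```python
# Hypothetical sketch of the step-3 plan shape: task bullets at the top,
# then per-step to-dos and a commit message for each step.
from dataclasses import dataclass

@dataclass
class PlanStep:
    title: str
    todos: list           # e.g. "generate tests", "run migrations"
    commit_message: str

def render_plan(steps: list) -> str:
    """Render the plan as Markdown: a checklist, then one section per step."""
    out = ["# Plan", "", "## Tasks"]
    out += [f"- [ ] {s.title}" for s in steps]
    for s in steps:
        out += ["", f"## {s.title}"]
        out += [f"- {t}" for t in s.todos]
        out += [f"- Commit: `{s.commit_message}`"]
    return "\n".join(out)

plan = render_plan([
    PlanStep("Add outbox table", ["write migration", "generate tests"],
             "feat: add outbox table"),
    PlanStep("Add relay worker", ["implement poller", "run migrations"],
             "feat: add outbox relay"),
])
print(plan)
```

One commit message per step keeps the execution phase honest: each checkbox maps to exactly one reviewable commit.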

4. Execute

This part is self-explanatory. It reads the plan and does its thing.

I've been extremely happy with this workflow. I'll probably write a blog post about it at some point.

  • If you want to have some fun, experiment with this: add a step (maybe between 3 and 4):

    3.5 Prove

    Have the LLM demonstrate, through our current documentation and other sources of facts, that the planned action WILL work correctly, without failure. Ask it to enumerate all risks and point out how the plan mitigates each one. On several occasions I've seen the LLM backtrack at this step and come up with clever, so-far-unforeseen error cases.
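A sketch of what that "prove" prompt could look like, assuming the plan and documentation paths shown here are placeholders:

```python
# Hypothetical sketch of the "3.5 Prove" step: ask the agent to argue from
# documented facts that the plan will work, and to map each risk to a
# mitigation. Paths and source names are illustrative assumptions.

def prove_prompt(plan_path: str, sources: list) -> str:
    """Build the prove-step prompt with an explicit risk enumeration demand."""
    srcs = "\n".join(f"- {s}" for s in sources)
    return (
        f"Using only these sources of facts:\n{srcs}\n"
        f"demonstrate that the plan in {plan_path} will work correctly, "
        "without failure. Enumerate every risk you can find and, for each, "
        "point out how the plan mitigates it. If a risk is unmitigated, "
        "stop and say so."
    )

msg = prove_prompt(
    "plans/payments-outbox.md",
    ["docs/architecture.md", "Postgres documentation on transactions"],
)
print(msg)
```

The "stop and say so" clause is what gives the model permission to backtrack instead of papering over an unhandled failure case.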

Yep, with a human in the loop to process these larger, sprawling plan docs (iteratively inflated with the designer's intent).

Some get deleted from the repo, others archived, others merged or referenced elsewhere. It's kind of organic.


  • > Haven't seen a single useful thing produced by this garbage process you describe

    By using it first-hand, or through a colleague? And useful to whom: you, or the person writing it? There are plenty of people in this thread, myself included, who have actually used this "garbage process" to produce things that we, and our colleagues, find useful.

  • What genuinely new thing have you produced?

    • Well, I'm actually producing, not having an LLM do things for me and frying my brain in the process. If you're building things with the process described above, you're not producing anything; Dario's or Altman's GPUs are. You're just a slot-machine user.

      Have fun paying for "Think-for-me SaaS".

      2025-2026: the years everyone became the mental equivalent of obese and let their brains atrophy. There are no shortcuts in life that don't come at a huge cost. Remember how everyone forgot how to navigate without a maps app? That's going to be you with writing, reading, and thinking about code.
