Comment by mjr00
4 hours ago
> Break down sessions into separate clear, actionable tasks. Don't try to "draw the owl" in one mega session.
This is the key one I think. At one extreme you can tell an agent "write a for loop that iterates over the variable `numbers` and computes the sum" and it'll do this successfully, but the scope is so small there's not much point in using an LLM. On the other extreme you can tell an agent "make me an app that's Facebook for dogs" and it'll make so many assumptions about the architecture, code and product that there's no chance it produces anything useful beyond a cool prototype to show mom and dad.
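For scale, that "too small" extreme is a handful of lines you could type faster than you could prompt; a sketch, assuming `numbers` is just a list of ints (only the variable name comes from the example above, the rest is made up):

```python
# The entire task: sum the values in `numbers`.
numbers = [3, 1, 4, 1, 5]
total = 0
for n in numbers:
    total += n
print(total)  # 14
```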
A lot of successful LLM adoption for code is finding this sweet spot. Overly specific instructions don't make you feel productive, and with overly broad instructions you end up redoing too much of the work.
This is actually an aspect of using AI tools I really enjoy: Forming an educated intuition about what the tool is good at, and tastefully framing and scoping the tasks I give it to get better results.
It cognitively feels very similar to other classic programming activities, like modularization at any level from architecture down to code units/functions: thoughtfully choosing how to lay out and chunk things. It's always been one of the things that make programming pleasurable for me, and some of that feeling returns when slicing up tasks for agents.
I agree that framing and scoping tasks is becoming a real joy. The great thing about this strategy is that there's a point at which you can scope something small enough that it's hard for the AI to get it wrong, yet easy enough for you as a human to comprehend what it's done and verify that it's correct.
I'm starting to think of projects now as a tree structure where the overall architecture of the system is the main trunk, from there you have the sub-modules, and eventually you get to implementations of functions and classes. The goal of the human working with the coding agent is to keep full editorial control of the main trunk and the main sub-modules, and to delegate as many of the smaller branches as possible.
Sometimes you're still working out the higher-level architecture, too, and you can use the agent to prototype the smaller bits and pieces which will inform the decisions you make about how the higher-level stuff should operate.
[Edit: I may have been replying to another comment in my head; now that I re-read it, I'm not sure I've said the same thing as you have. Oh well.]
I agree. This is how I see it too. It's more like a shortcut to an end result that's very similar to (or much better than) what I would've reached by typing it myself.
The other day I did realise that I'm using my experience to steer it away from bad decisions a lot more than I'd noticed. It feels like it does all the real work, but I have to remember that my/our (decades of) experience writing code is playing a part as well.
I'm genuinely confused when people come in at this point and say that it's impossible to do this and produce good output and end results.
I feel the same, but also, within like three years this might look very different. Maybe you'll give it the full end-to-end goal upfront and it just polls you when it needs clarification or wants to suggest alternatives, and it self-manages, cleanly delegating tasks to itself along the way.
Or maybe something quite different, where these early-era agentic tooling strategies become either unnecessary or even actively detrimental.
> it just polls you when it needs clarification
I think anyone who has worked on a serious software project would say that this means it would be polling you constantly.
Even if we posit that an LLM is equivalent to a human, humans constantly clarify requirements/architecture. IMO on both of those fronts the correct path often reveals itself over time, rather than being knowable from the start.
So in this scenario it seems like you'd be dealing with constant pings and would need to make sure your understanding of the project is growing along with the LLM's development efforts as well.
To me this seems like the best case for the current technology: the models have been getting better and better at doing what you tell them in small chunks, but you still need to be deciding what they should be doing. These chunks don't feel as though they're getting bigger unless you're willing to accept slop.
> On the other extreme you can tell an agent "make me an app that's Facebook for dogs" and it'll make so many assumptions about the architecture, code and product that there's no chance it produces anything useful beyond a cool prototype to show mom and dad.
Amusingly, this was my experience in giving Lovable a shot. The onboarding process was literally setting me up for failure by asking me to describe, in detail, the app I was attempting to build.
Taking it piece by piece in Claude Code has been significantly more successful.
And lately, the sweet spot has been moving upwards every 6-8 weeks with the model release cycle.
So many times I catch myself asking a coding agent, e.g., "please print the output", and it will update the file with "print(output)".
Maybe there's something about not having to context switch between natural language and code that just makes it _feel_ easier sometimes.
Exactly. The LLMs are quite good at "code inpainting", e.g. "give me the outline/constraints/rules and I'll fill in the blanks".
But not so good at making (robust) new features out of the blue.
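To make the "inpainting" framing concrete, here's a minimal sketch of the kind of outline that tends to work well, assuming a Python codebase; `Order`, `apply_vat`, and the rules in the docstring are invented for illustration, not from the thread:

```python
# Hypothetical "code inpainting" prompt: hand the agent a typed outline
# plus the rules in the docstring, and ask it to fill in only the body.
from dataclasses import dataclass


@dataclass
class Order:
    subtotal_cents: int
    country: str


def apply_vat(order: Order, rates: dict[str, float]) -> int:
    """Return the order total in cents, including VAT.

    Rules for the blank:
    - Look up the rate by order.country; use 0.0 if the country is missing.
    - Round to the nearest cent.
    - Never return less than the subtotal.
    """
    raise NotImplementedError  # <-- the blank the agent fills in
```

The types and the docstring carry the constraints, so the model's job shrinks to the body, which is also the part that's easiest to review.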
I actually enjoy writing specifications. So much so that I made it a large part of my consulting work for much of my career. So it makes sense that working with Gen-AI that way is enjoyable for me.
The more detailed I am in breaking down chunks, the easier it is for me to verify the result and the more likely I am to get output that isn't 30% wrong.