Comment by chaboud

11 hours ago

The author seems to think they've hit upon something revolutionary...

They've actually hit upon something that several of us have evolved to naturally.

LLMs are like unreliable interns with boundless energy. They make silly mistakes, wander into annoying structural traps, and have to be unwound if left to their own devices. It's like the genie that almost pathologically misinterprets your wishes.

So, how do you solve that? Exactly how an experienced lead or software manager does: you have the system write things down before executing, explain them back to you, and ground all of its thinking in the code and documentation, rather than making assumptions about code after a superficial review.

With early ChatGPT, this meant function-level thinking and clearly described jobs. With Cline, it meant .clinerules files that forced writing architecture.md files and vibe-code.log histories and demanded grounding in research and code reading.

Maybe nine months ago, another engineer said two things to me, less than a day apart:

- "I don't understand why your clinerules file is so large. You have the LLM jumping through so many hoops and doing so much extra work. It's crazy."

- The next morning: "It's basically like a lottery. I can't get the LLM to generate what I want reliably. I just have to settle for whatever it comes up with and then try again."

These systems have to deal with minimal context, ambiguous guidance, and extreme isolation. Operate with a little empathy for the energetic interns, and they'll uncork levels of output worth fighting for. We're Software Managers now. For some of us, that's working out great.

Revolutionary or not, it was very nice of the author to take the time and effort to share their workflow.

For those starting out with Claude Code, it gives a structured way to get things done, bypassing the time and energy needed to “hit upon something that several of us have evolved to naturally”.

  • It's this line that I'm bristling at: "...the workflow I’ve settled into is radically different from what most people do with AI coding tools..."

    Anyone who spends some time with these tools (and doesn't black out from smashing their head against their desk) is going to find substantial benefit in planning with clarity.

    It was #6 in Boris's run-down: https://news.ycombinator.com/item?id=46470017

    So, yes, I'm glad that people write things out and share. But I'd prefer that they not lead with "hey folks, I have news: we should *slice* our bread!"

    • But the author's workflow is actually very different from Boris'.

      #6 is about using plan mode whereas the author says "The built-in plan mode sucks".

      The author's post is much more than just "planning with clarity".

    • I would say he’s saying “hey folks, I have news. We should slice our bread with a knife rather than the spoon that came with the bread.”

    • > Anyone who spends some time with these tools (and doesn't black out from smashing their head against their desk) is going to find substantial benefit in planning with clarity.

      That's obvious by now, and the reason why all mainstream code assistants now offer planning mode as a central feature of their products.

      It was baffling to read the blogger making claims about what "most people" do when anyone using code assistants already does it. I mean, the so-called frontier models are very expensive and time-consuming to run. There's a very natural pressure to make each run count. Why on earth would anyone presume people don't put some thought into those runs?

  • These kinds of flows have been documented in the wild for some time now. They started popping up in the Cursor forums 2+ years ago, e.g. https://github.com/johnpeterman72/CursorRIPER

    Personally, I have been using a similar flow, tailored to my needs, for almost 3 years now. Everybody who uses AI for coding eventually gravitates towards a similar pattern because it works quite well (across IDEs, CLIs, and TUIs).

  • It's AI-written though; the tells are in pretty much every paragraph.

    • I don’t think it’s that big a red flag anymore. Most people use AI to rewrite or clean up content, so I’d think we should actually evaluate content for what it is rather than stop at “nah, it’s AI-written.”

    • > the tells are in pretty much every paragraph.

      It's not just misleading — it's lazy. And honestly? That doesn't vibe with me.

      [/s obviously]

    • So is GP.

      This is clearly a standard AI exposition:

      LLMs are like unreliable interns with boundless energy. They make silly mistakes, wander into annoying structural traps, and have to be unwound if left to their own devices. It's like the genie that almost pathologically misinterprets your wishes.

    • Then ask your own AI to rewrite it so it doesn't trigger you into posting uninteresting, thought-stopping comments that proclaim why you didn't read the article and don't contribute to the discussion.

Agreed. The process described is much more elaborate than what I do, but quite similar. I start by discussing in great detail what I want to do, sometimes asking the same question to different LLMs. Then a todo list, then a manual review of the code, especially each function signature, checking whether the instructions have been followed and whether there are obvious refactoring opportunities (there almost always are).

The LLM does most of the coding, yet I wouldn't call it "vibe coding" at all.

"Tele coding" would be more appropriate.

  • I use AWS Kiro, and its spec-driven development is exactly this. I find it really works well, as it makes me slow down and think about what I want it to do.

    Requirements, design, task list, coding.

I’ve also found that a bigger focus on expanding my agents.md as the project rolls on has led to fewer headaches overall and more consistency (unsurprisingly). It’s the same as asking juniors to reflect on the work they’ve completed and to document important things that can help them in the future. Software Manager is a good way to put this.

  • AGENTS.md should mostly point to real documentation and design files that humans will also read and keep up to date. It's rare that something about a project is only of interest to AI agents.
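
    A minimal sketch of what that can look like (the file paths here are hypothetical):

        # AGENTS.md
        - Architecture overview: docs/architecture.md
        - Coding conventions: docs/style-guide.md
        - How to build and run tests: docs/testing.md

    The substance lives in those documents; this file just tells the agent (and any new human) where to look.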

It feels like retracing the history of software project management. The post is quite waterfall-like: writing a lot of docs and specs upfront, then implementing. Another approach is to just YOLO (on a new branch), have it write up the lessons afterwards, then start a new, more informed try and throw away the first. Or any other combo.

For me what works well is to ask it to write some code upfront to verify its assumptions against actual reality, not just telling it to review the sources “in detail”. It gains much more from real output from the code, and that clears up wrong assumptions. Do some smaller jobs, write up md files, then plan the big thing, then execute.
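
For example, a throwaway probe like this gives the model ground truth to plan from instead of a guess (just a sketch; the database file and table name are hypothetical):

    # probe.py: print what the schema actually looks like before planning the change
    import sqlite3

    conn = sqlite3.connect("app.db")  # hypothetical database the task touches
    for cid, name, col_type, notnull, default, pk in conn.execute(
        "PRAGMA table_info(users)"    # hypothetical table the plan will modify
    ):
        print(name, col_type, "PRIMARY KEY" if pk else "")

Feeding that output back into the planning step costs one cheap run and removes a whole class of "it assumed a column that doesn't exist" failures.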

  • 'The post is quite waterfall-like. Writing a lot of docs and specs upfront then implementing' - It's only waterfall if the specs cover the entire system or app. If it's broken up into sub-systems or vertical slices, then it's much more Agile or Lean.

  • It makes an endless stream of assumptions. Some of them are brilliant and even instructive to a degree, but in my experience most are unfounded and inappropriate.

I really like your analogy of LLMs as 'unreliable interns'. The shift from being a 'coder' to a 'software manager' who enforces documentation and grounding is the only way to scale these tools. Without an architecture.md or similar grounding, the context drift eventually makes the AI-generated code a liability rather than an asset. It's about moving the complexity from the syntax to the specification.

I've been doing the exact same thing for 2 months now. I wish I had gotten off my ass and written a blog post about it. I can't blame the author for gathering all the well-deserved clout they are getting for it now.

  • I went through the blog. I started using Claude Code about 2 weeks ago and my approach is practically the same. It just felt logical. I think there are a bunch of us who have landed on this approach and most are just quietly seeing the benefits.

  • Don’t worry. This advice has been going around for much longer than 2 months, including links posted here as well as official advice from the major companies (OpenAI and Anthropic) themselves. The tools have literally had plan mode as a first-class feature.

    So you probably wouldn’t have gotten any clout anyway, just like all of the other blog posts.

Oh no, maybe the V-Model was right all along? And right-sizing increments, with control stops after them. No wonder these matrix multiplications start to behave like humans; that is what we wanted them to do.

It's nice to have it written down in a concise form. I shared it with my team as some engineers have been struggling with AI, and I think this (just trying to one-shot without planning) could be why.

> LLMs are like unreliable interns with boundless energy

This isn’t directed specifically at you but at the general community of SWEs: we need to stop anthropomorphizing a tool. Code agents are not human-capable, and scaling pattern matching will never hit that goal. That’s all hype, and this is coming from someone who runs the full range of daily CC usage. I’m using CC to its fullest capability while also being a good shepherd for my prod codebases.

Pretending code agents are human-capable is fueling this Kool-Aid-drinking hype craze.

  • It’s pretty clear they effectively take on the roles of various software-related personas: designer, coder, architect, auditor, and so on.

    Pretending otherwise is counterproductive. That ship has already sailed; it is fairly clear the best way to make use of them is to pass them input messages as if they were an agent of a person in that role.

> The author seems to think they've hit upon something revolutionary...

> They've actually hit upon something that several of us have evolved to naturally.

I agree, it looks like the author is talking about spec-driven development with extra time-consuming steps.

Copilot's plan mode also supports iterations out of the box, and the plan only gets executed after you've manually reviewed and edited it. I don't know what the blogger was proposing that ventured outside of plan mode's happy path.

If you have a big rules file you’re headed in the right direction, but you're still not there. Just as with humans, the key is that your architecture should make it very difficult to break the rules by accident and still compile/run with a successful exit status.

My architecture is so beautifully strong that even LLMs and human juniors can’t box their way out of it.
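
One way to make that concrete is to encode the rules as a check that fails the build. A minimal sketch of such a gate (the layer names and paths are hypothetical), the kind of thing you would wire into CI or a pre-commit hook:

    # check_layers.py: exit non-zero when a layer imports something it shouldn't
    import ast
    import pathlib
    import sys

    # Hypothetical rule: code under ui/ must never import from db/ directly.
    FORBIDDEN = {"ui": {"db"}}

    violations = []
    for layer, banned in FORBIDDEN.items():
        for path in pathlib.Path(layer).rglob("*.py"):
            tree = ast.parse(path.read_text(), filename=str(path))
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    names = [alias.name for alias in node.names]
                elif isinstance(node, ast.ImportFrom) and node.module:
                    names = [node.module]
                else:
                    continue
                for name in names:
                    if name.split(".")[0] in banned:
                        violations.append(f"{path}: imports {name}")

    if violations:
        print("\n".join(violations))
        sys.exit(1)  # neither the agent nor a junior can merge this by accident

The specific rule matters less than the fact that it is enforced by an exit status rather than by anyone's memory.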