Comment by mirekrusin
3 hours ago
Actually, nobody said the spec needs to be written by humans.
My personal opinion: with today's LLMs, the spec should be steered by a human, because the quality of the result is proportional to the quality of the spec. Human interaction is much cheaper at that stage: it's all natural language that makes sense. Later, reasoning about the code itself will be harder.
In general, any non-trivial, valuable output must be based on some verification loop. A spec is just one way to express verification (natural language — a bit fuzzy, but still counts). Others are typecheckers, tests, and linters (especially when linter rules relate to correctness, not just cosmetics).
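To make the "verification loop" idea concrete, here is a minimal sketch in Python. The names `Check`, `verify`, and `verification_loop` are made up for illustration; in practice each check's `run` would probably shell out to a test runner, typechecker, or linter, which is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Check:
    """One verifier (hypothetical type): a name plus a callable that returns True on pass."""
    name: str
    run: Callable[[], bool]


def verify(checks: List[Check]) -> Tuple[bool, List[str]]:
    """Run every check and return (all_passed, names_of_failed_checks)."""
    failed = [c.name for c in checks if not c.run()]
    return (not failed, failed)


def verification_loop(
    produce: Callable[[List[str]], None],
    checks: List[Check],
    max_rounds: int = 5,
) -> bool:
    """Call produce(previous_failures), then verify; repeat until clean or out of rounds."""
    failures: List[str] = []
    for _ in range(max_rounds):
        produce(failures)            # e.g. ask the LLM to fix the listed failures
        ok, failures = verify(checks)
        if ok:
            return True
    return False
```

The point of the shape is that `produce` never decides on its own that it is done: only the checks do.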
Personally, on non-trivial tasks, I see very good results with iterative, interactive, verifiable loops:
- Start with a task
- Write spec in e.g. SPEC.md → "ask question" until answer is "ok"/proceed
- Write implementation PLAN.md — topologically sorted list of steps, possibly with substeps → ask question
- For each step: implement, write tests, verify (step isn't done until tests pass, typecheck passes, etc.); update SPEC/PLAN as needed → ask question
- When done, convert SPEC.md and PLAN.md into PR description (summary) and discard
("Ask question" means an interactive prompt shown to the user. Each step is gated by this prompt: it holds off further progress, giving you a chance to review and modify the result in small pieces you can actually reason about.) The workflow is that you accept all changes before confirming the next step. This way you get code deltas that make sense. You can review and understand them, and if something's wrong you can fix it by hand (especially renames, which editors like VS Code handle nicely) or prompt for a change. The LLM is instructed to ask again after every change and to proceed only when the answer is "ok".
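A rough sketch of the gated loop described above, in Python. Everything here is an assumption for illustration: `order_steps`, `run_plan`, and the `implement` / `verify` / `ask` / `revise` callables are hypothetical stand-ins for whatever the agent and editor actually do. The only real library call is the standard library's `graphlib.TopologicalSorter` (Python 3.9+), used to get the topologically sorted step list.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def order_steps(deps: Dict[str, Set[str]]) -> List[str]:
    """Order steps so each one comes after the steps it depends on."""
    return list(TopologicalSorter(deps).static_order())


def run_plan(
    deps: Dict[str, Set[str]],
    implement: Callable[[str], None],
    verify: Callable[[str], bool],
    ask: Callable[[str], str],
    revise: Callable[[str, str], None],
    max_attempts: int = 3,
) -> List[str]:
    """Implement each step, verify it, then block on the reviewer until they answer "ok"."""
    done: List[str] = []
    for step in order_steps(deps):
        implement(step)
        attempts = 1
        while True:
            if not verify(step):     # tests, typecheck, etc. must pass first
                if attempts >= max_attempts:
                    raise RuntimeError(f"step {step!r} still failing after {attempts} attempts")
                implement(step)
                attempts += 1
                continue
            answer = ask(f"Step {step!r} verified. Reply 'ok' to proceed: ").strip()
            if answer.lower() == "ok":
                break                # gate opens, move to the next step
            revise(step, answer)     # anything else is a change request; re-verify after
        done.append(step)
    return done
```

Passing the built-in `input` as `ask` gives a plain terminal gate. Because the gate is just a callable, swapping the human for an "LLM judge" would mean passing a different `ask`, with no change to the loop itself.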
This works with systems like VS Code Copilot, less well with the CC CLI.
I'm looking forward to an automated setup where the "human" is replaced by an "LLM judge" — I think you could already design a fairly efficient system like this, but for my work LLMs aren't quite there yet.
That said, there's an aspect that shouldn't be forgotten: this interactive approach keeps you in the driving seat and you know what's happening with the codebase, especially if you're running many of these loops per day. Fully automated solutions leave you outside the picture. You'll quickly get disconnected from what's going on — it'll feel more like a project run by another team where you kind of know what it does on the surface but have no idea how. IMO this is dangerous for long-term, sustainable development.