Comment by simonw

2 hours ago

Something I've been trying recently for non-throwaway code is extensive refactoring, without typing any code myself but by closely directing the coding agent.

Prompts like "move the code relating to SQL query analysis into a new file", "look for opportunities to use pytest parametrize to remove duplication in that test", "rename method X to Y".

Early indications are that this is helping a lot with the problem where it's easy to churn out thousands of lines of code and not really have it stick in my head, even if I review every line of it.

Reviewing code and actively refactoring it is less tedious and more mentally engaging than reviewing code without changes.

If this was a human collaborator I'd be worried that I'm just creating busywork for them, but I don't care about busywork for LLMs!

The goal is to produce code that I understand and that I can remember just well enough that I get an updated mental model to help me productively make future decisions about the codebase.

>Prompts like "move the code relating to SQL query analysis into a new file", "look for opportunities to use pytest parametrize to remove duplication in that test", "rename method X to Y".

There’s a lot of overlap there with the sorts of things traditional automated refactoring tools can do approximately instantly, locally, and for free.

  • Yea, when I read about people using AI with prompts like that, my first thought is, "Wow, that's like copy/paste, but instead of Ctrl-C/Ctrl-V, it's round-tripping to a server and using GPUs to do it." What's next? "Claude, rename the function doFoo() to performBar()"?

    • Here's the loop for a successful small refactor (anything beyond a rename that could be handled entirely by an IDE):

      1. Find the code you want to change

      2. Run the tests to confirm that test coverage is good for the starting point

      3. Track down everywhere else that might call or interact with that code

      4. Update the tests (red/green TDD)

      5. Alter the code

      6. Update the things that call the code

      7. Run the tests again

      8. Apply linters/formatters

      9. Address any feedback from linters

      10. Check to see if any documentation needs updating and do that

      11. Land a commit with a descriptive commit message

      I can get all of that done with a coding agent with a single sentence prompt - especially if it's already in a session where it knows that I do "red/green TDD".

      ... and then I can work on something else while the agent is churning through those steps.

      2 replies →

  • Yeah I do find myself leaning back into those tools. For awhile I’d just prompt to rename something. But when it’s my own tokens I’m paying for, I prefer the fast and free option :)

  • Sure, and sometimes the coding agent will even use one of those refactoring tools on my behalf.

    Getting them to run ast-grep is really fun, especially when it saves me from having to memorize that syntax myself.

  • What are some traditional automated refactoring tools that can do stuff like those tasks from the example?

    • ???

      Mature workflows for those kinds of tasks have been mostly ubiquitous across professional-grade engineering tools like those from JetBrains or Visual Studio itself for longee than many people here have even been working in the trade.

      It's clearly not the case for simonw, but much of what many people task AI tools to do foe them are only a novelty for the "VS Code"-type users who stubbornly refused to explore more professional-grade paid tools in the past.

      Yet for many tasks, those mature paid tools provided reliable and efficient features that make the AI approach look like an expensive, slow, and dangerously nondeterministic regression.

      1 reply →

I think the best approach is active code review as the agent does small batches. Or letting it come up with a solution, testing if it passes or fails the desired outcome, then creating a separate fresh project and asking it to rewrite in small parts, and have it explain to you what and why it's doing to achieve each part.

Interesting idea.

It’s almost like a buffer space would be useful for code.

I’ve been using tuicr for agent code reviews and have been enjoying that. I think I’ll try your idea as part of my workflow.