Comment by jarjoura
2 days ago
As of Dec 2025, Sonnet/Opus and GPTCodex are both trained for agentic exploration, and most good agent tools (i.e. opencode, claude-code, codex) have prompts to fire off subagents during an exploration (use the word "explore"), so you should be able to research without the extra steps of writing plans and resetting context. I'd save that expense unless you need a huge multi-step verifiable plan implemented.
The biggest gotcha I found is that these LLMs love to write code as if it were C/Python, just transliterated into your language of choice. Instead of considering that something should be encapsulated in an object to maintain state, it will write 5 functions, passing the state as parameters between them. It will also consistently ignore most of the surrounding code, even when reading it would tell it what could specifically be reused. So you end up with copy-pasta code, and unstructured copy-pasta at best.
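To make that pattern concrete, here's a toy sketch (all names invented for illustration) of the state-threading shape versus the encapsulated version:

```python
# The shape LLMs tend to produce: shared state (text, pos) is passed
# into and returned from every function.
def advance(text, pos):
    return pos + 1

def read_word(text, pos):
    start = pos
    while pos < len(text) and text[pos].isalpha():
        pos = advance(text, pos)
    return text[start:pos], pos  # caller must remember to keep the new pos

# The encapsulated version: the state lives in one object.
class Scanner:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def read_word(self):
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos].isalpha():
            self.pos += 1
        return self.text[start:self.pos]
```

Both work, but in the first version every call site has to thread `pos` correctly by hand, which is exactly where the copy-pasta accumulates.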
The other gotcha is that claude usually ignores CLAUDE.md. So for me, I first prompt it to read it, then prompt it to explore. With those two rules in place, it usually does a good job following my request to fix something, add a new feature, or whatever, all within a single context. These recent agents do a much better job of throwing away useless context.
I do think the older models and agents get better results when writing things to a plan document, but I've noticed recent opus and sonnet usually end up just writing the same code to the plan document anyway. That usually ends up confusing the model, because it can't connect the plan to the code around the changes as easily.
>Instead of considering that something should be encapsulated in an object to maintain state, it will write 5 functions, passing the state as parameters between them.
Sounds very functional, testable, and clean. Sign me up.
I know this is tongue in cheek, but writing functional code in an object-oriented language, or, even worse, taking a giant procedural trail of tears and spreading it across a few files like a Roomba through a pile of dog doo, is... well... a code smell at best.
I have a user prompt saved called "clean code" that makes a pass through the changes to remove unused code, DRY things up, and refactor - literally the high points of Uncle Bob's Clean Code. It works shockingly well at taking AI code and making it somewhat maintainable.
>I know this is tongue in cheek, but writing functional code in an object-oriented language, or, even worse, taking a giant procedural trail of tears and spreading it across a few files like a Roomba through a pile of dog doo, is... well... a code smell at best.
After forcing myself for years to apply various OOP principles across multiple languages, I believe OOP has truly been the worst thing to happen to me personally as an engineer. Now, I believe what you actually see is just an "aesthetics" issue; moreover, it's purely learned aesthetics.
Does its output follow Uncle Bob's "no comments needed" principle?
Not so much tongue in cheek, but a little on the light side, sure.
I'd argue writing functional code in C++ (which is multi-paradigm anyway), or Java, or TypeScript is fine!
Care to share the prompt? Sounds useful!
> As of Dec 2025, Sonnet/Opus and GPTCodex are both trained for agentic exploration, and most good agent tools (i.e. opencode, claude-code, codex) have prompts to fire off subagents during an exploration (use the word "explore"), so you should be able to research without the extra steps of writing plans and resetting context. I'd save that expense unless you need a huge multi-step verifiable plan implemented.
Does the UI clearly show what portion was done by a subagent?
Yes it will; this is almost verbatim (redacted product) claude-code output from my current session:
During agent execution, it also shows what each sub-agent is up to. In ctrl+o mode it'll show the prompts it passed to each sub-agent.
The UI (terminal) in Claude Code will tell you if it has launched a subagent to research a particular file or problem. But it won't be highlighted for you, just displayed in its record of prompts and actions.
If you use the vscode extension you can click to view the sub-agent prompts and see all tool calls.
If claude ignores your CLAUDE.md, you can force it to read it via settings, for example by catting it at every session start.
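A minimal sketch of what that could look like using Claude Code's hooks (a `SessionStart` hook in `.claude/settings.json`; verify the exact schema against the hooks docs for your version):

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "cat CLAUDE.md"
          }
        ]
      }
    ]
  }
}
```

The idea is that stdout from a session-start hook gets added to the context, so CLAUDE.md is effectively re-read at the start of every session.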
Interesting, for me they almost always assume/write TS.
AI can be an FP absolutist too.