Comment by kierangill

2 months ago

Agreed here. A key theme, which isn’t terribly explicit in this post, is that your codebase is your context.

I’ve found that when my agent flies off the rails, it’s due to an underlying weakness in the construction of my program: the organization of the codebase doesn’t implicitly encode the “map”. Writing a prompt library helps to overcome this weakness, but the most enduring guidance comes from updating the codebase itself to be more discoverable.


fragmede 2 months ago

> my agent flies off the rails

I've had it delete the entire project, including .git, out of "shame", so my Claude doesn't get permission to run rm anymore.

Codex has fewer levers but it's deleted my entire project twice now.

(Play with fire, you're gonna get burnt.)

CPLX 2 months ago

Wait, what? Can you please describe this shame incident?

Also, I have extremely frequent commits and version control syncs to GitHub and so on as part of the process (including when it's working on documents or things that aren't code) as a way to counteract this.

Although I suppose a sufficiently devious AI can get around those, it seems to not have been a problem.

- ewoodrich 2 months ago
  
  Not OP, and I haven't had it flat out rm the entire .git, but I have had Claude get flustered, pull a "Wait, no! What was I thinking? That idea doesn't work at all here, I need to revert that attempt and try something else...", and then run a fatally flawed "git checkout" command that wiped out all unstaged changes. It realized this immediately and, after flailing around for five minutes trying to undo it, eventually came back saying "yeah uh so sorry, but... here's the thing..."
  
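
The guardrail mentioned upthread (no permission to run rm, plus wariness of a destructive "git checkout") could be sketched as a small check that runs before an agent's shell command is executed. This is only a rough sketch under stated assumptions, not the permission system of Claude, Codex, or any other tool: `is_destructive`, `SEPARATORS`, and the denylist itself are hypothetical names and choices made up for illustration. It also only recognizes shell operators written as separate, space-delimited tokens, and it misses forms such as `git -C path ...` or `sh -c "..."`.

```python
import shlex

# Hypothetical: shell operators that start a new command when they appear
# as their own whitespace-separated token.
SEPARATORS = {"&&", "||", ";", "|"}


def _segment_is_destructive(tokens):
    """Check one simple command (program plus arguments) against a denylist."""
    if not tokens:
        return False
    prog, args = tokens[0], tokens[1:]
    if prog == "rm":
        return True  # deny rm outright, as in the comment above
    if prog == "git" and args:
        sub, rest = args[0], args[1:]
        if sub in {"clean", "restore"}:
            return True  # conservative: both can discard working-tree files
        if sub == "reset" and "--hard" in rest:
            return True
        if sub == "checkout" and any(a in {"--", ".", "-f", "--force"} for a in rest):
            return True  # forms of checkout that can overwrite local changes
    return False


def is_destructive(command):
    """Return True if any simple command in the line looks like it could discard work."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable input (e.g. unbalanced quotes): refuse rather than guess
    segment = []
    for tok in tokens:
        if tok in SEPARATORS:
            if _segment_is_destructive(segment):
                return True
            segment = []
        else:
            segment.append(tok)
    return _segment_is_destructive(segment)
```

With this sketch, `is_destructive("git checkout -- .")` and `is_destructive("cd repo && git reset --hard")` are flagged, while `is_destructive("git checkout main")` and `is_destructive("git status")` pass through. A denylist like this is easy to get around, so it complements frequent commits and remote syncs rather than replacing them.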