Comment by alistairSH
1 day ago
It reads like he had a circular prompt process running, where multiple instances of Claude were solving problems, feeding results to each other, and possibly updating each other's control files?
They were trying to optimize a CLAUDE.md file that belonged to a project template. The outer Claude instance iterated on the file. To test the result, the human in the loop instantiated a new project from the template, launched an inner Claude instance against it, and assessed whether the inner Claude behaved as expected with the CLAUDE.md in the freshly generated project. They then fed that assessment back to the outer Claude.
So, no circular prompt feeding at all. Just a normal iterate-test-repeat loop that happened to involve two agents.
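For concreteness, here's a minimal sketch of that loop, with made-up helper names standing in for "ask outer Claude", "generate a project", "run inner Claude", and "human reviews the result" (nothing here is an actual API from the post):

    # Hypothetical helpers -- placeholders, not real functions from any Claude tooling.
    def outer_claude_revise(claude_md, feedback): ...
    def instantiate_project(template, claude_md): ...
    def run_inner_claude(project): ...
    def human_review(transcript): ...  # returns (acceptable, notes)

    def optimize_claude_md(template, initial_claude_md, max_rounds=5):
        claude_md, feedback = initial_claude_md, None
        for _ in range(max_rounds):
            # Outer Claude revises the control file based on the latest human feedback.
            claude_md = outer_claude_revise(claude_md, feedback)
            # Human instantiates a fresh project from the template with the revised file...
            project = instantiate_project(template, claude_md)
            # ...and launches an inner Claude instance against it.
            transcript = run_inner_claude(project)
            # Human judges how the inner instance behaved and relays that to outer Claude.
            acceptable, feedback = human_review(transcript)
            if acceptable:
                break
        return claude_md

The two instances never exchange messages directly; the human and the generated project sit between them.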
What would be bad in that?
Writing the best possible specs for these agents seems the most productive goal they could achieve.
I think the idea is fine, but what might end up happening is that one agent gets unhinged and "asks" the other to do more and more crazy stuff, and they end up in a loop where everything gets flagged. Remember the "Amazon pricing bots kept adding $0.01 to a book's price until it hit $1M" story from a while ago? Kinda like that, but with prompts.
I still don't get it: make your models handle this far-fetched case better, don't ban users for a legitimate use case.
Nothing necessarily or obviously bad about it, just trying to think through what went wrong.
Could anyone explain to me what the problem is with this? I thought I was fairly up to date on these things, but this was a surprise to me. I see the sibling comment getting downvoted but I promise I'm asking this in good faith, even if it might seem like a silly question (?) for some reason.
From what I'm reading in other comments, the problem was that Claude1 got increasingly "frustrated" with Claude2's inability to do whatever the human was asking, and started breaking its own rules (using ALL CAPS).
Sort of like MS's old chatbot that turned into a Nazi overnight, but this time with one agent simply getting tired of the other agent's lack of progress (for some definition of progress - I'm still not entirely sure what the author was feeding into Claude1 alongside errors from Claude2).