Comment by bob1029

1 month ago

I'm beginning to question the notion that multi agent patterns don't work. I think there is something extra you get with a proposer-verifier style loop, even if both sides are using the same base model.

I've had very good success with a recursive sub agent scheme where a separate prompt (agent) is used to gate the recursive call. It compares the callers prompt with the proposed callee's prompt to determine if we are making a reasonable effort to reduce the problem into workable base cases. If the two prompts are identical we deny the request with an explanation. In practice, this works so well I can allow for unlimited depth and have zero fear of blowing the stack. Even if the verifier gets it wrong a few times, it only has to get it right once to reverse an infinite descent.

2 comments

bob1029

krackers 1 month ago

>I think there is something extra you get with a proposer-verifier style loop, even if both sides are using the same base model.

DeepSeekMath-V2 seems to show this, increasing the number of prover/verifier iterations gives increases accuracy. And this is with a model that has already undergone RL under a prover/verifier selection process.

However this type of subagent communication maintains full context, and is different from "breaking into tasks" style of sharding amongst subagents. I'm less convinced of the latter, because often times a problem is more complex than the sum of its parts, i.e. it's the interdependencies that make it complex and you need to consider each part in relation to the other parts, not in isolation.

bob1029 1 month ago

The specific way in which we invoke the subagents is critical to the performance of the system. If we use a true external call stack and force proper depth first recursion, the effective context can be maintained to whatever depth is desired.
Parallelism and BFS style approaches do not exhibit this property. Anything that happens within the context or token stream is a much weaker solution. Most agent frameworks are interested in appearance of speed, so they miss out on the nuance of this execution model.