
Comment by alphazard

11 hours ago

Every time I read something like this, it strikes me as an attempt to convince people that various people-management memes will still be relevant going forward, or even that they work when applied to humans today. The reality is that these roles don't even work in human organizations. Classic "job_description == bottom_of_funnel_competency" fallacy.

If they make the LLMs more productive, it is probably explained by a less complicated phenomenon that has nothing to do with the names of the roles, or their descriptions. Adversarial techniques work well for ensuring quality, parallelism is obviously useful, important decisions should be made by stronger models, and using the weakest model for the job helps keep costs down.
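The "weakest model for the job" point can be sketched as a trivial cost-aware router. The model names, tiers, and relative costs below are made up for illustration only:

```python
# Hypothetical cost-aware router: pick the weakest (cheapest) model
# judged sufficient for the task, reserving strong models for the
# important decisions. Names and costs are illustrative, not real.

MODELS = [          # ordered cheapest-first: (name, relative cost)
    ("small",  1),  # bulk work: summaries, reformatting
    ("medium", 5),  # routine coding tasks
    ("large", 25),  # important decisions, final review
]

TIER = {"trivial": 0, "routine": 1, "critical": 2}

def route(difficulty: str) -> str:
    """Return the cheapest model allowed for this difficulty tier."""
    return MODELS[TIER[difficulty]][0]

print(route("trivial"), route("critical"))  # small large
```

Nothing about role names enters the decision; the productivity win, if any, comes from matching model strength to task importance.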

My understanding is that the main reason splitting up work is effective is context management.

For instance, if an agent only has to be concerned with one task, its context can be massively reduced. Further, the next agent can just be told the outcome; it also has a reduced context load, because it doesn't need the inner workings, just the result.

For instance, a security testing agent just needs to review code against a set of security rules, and then list the problems. The next agent then just gets a list of problems to fix, without needing a full history of working it out.
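A minimal sketch of that two-stage hand-off, with a stubbed-in `call_llm` standing in for a real model API (the function name and canned replies are hypothetical):

```python
# Two-stage pipeline: the reviewer sees only the code plus the security
# rules; the fixer sees only the distilled problem list, never the
# reviewer's full working context. `call_llm` is a placeholder stub.

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for a real LLM API call; returns canned replies here."""
    if "security reviewer" in system_prompt:
        return ("1. SQL built by string concatenation\n"
                "2. Password logged in plain text")
    return "Applied fixes for each listed problem."

def review_security(code: str, rules: list[str]) -> list[str]:
    # The reviewer's context is just the code and the rule list.
    reply = call_llm(
        system_prompt="You are a security reviewer. List violations only.",
        user_prompt="Rules:\n" + "\n".join(rules) + "\n\nCode:\n" + code,
    )
    return [line for line in reply.splitlines() if line.strip()]

def fix_problems(code: str, problems: list[str]) -> str:
    # The fixer never sees the rules or the reviewer's reasoning,
    # only the short problem list -- that is the context saving.
    return call_llm(
        system_prompt="You are a code fixer.",
        user_prompt="Problems:\n" + "\n".join(problems) + "\n\nCode:\n" + code,
    )

problems = review_security(
    "query = 'SELECT * FROM users WHERE id=' + uid",
    ["no string-built SQL", "no secrets in logs"],
)
result = fix_problems("...", problems)
```

The second call's prompt grows with the number of problems, not with the length of the review transcript, which is the whole point.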

  • Which, ultimately, is not such a big difference to the reason we split up work for humans, either. Human job specialization is just context management over the course of 30 years.

    • > Which, ultimately, is not such a big difference to the reason we split up work for humans,

      That's mostly for throughput, and context management.

      It's context management in that no human knows everything, but that's also throughput in a way because of how human learning works.

  • I’ve found that task isolation, rather than preserving your current session’s context budget, is where subagents shine.

    In other words, when I have a task that specifically should not have project context, then subagents are great. Claude will also summon these “swarms” for the same reason. For example, you can ask it to analyze a specific issue from multiple relevant POVs, and it will create multiple specialized agents.

    However, without fail, I’ve found that creating a subagent for a task that requires project context will result in worse outcomes than using “main CC”, because the sub simply doesn’t receive enough context.

  • So, two things. Yes, this helps with context and is a primary reason to break out the sub-agents.

    However, one of the bigger things is that by giving it a focus on a specific task or role, you force the LLM to "pay attention" to certain aspects. The models have finite attention, and if you ask them to pay attention to "all things", they just ignore some.

    The act of forcing the model to pay attention can be accomplished in other ways (a defined process, committee formation in a single prompt, etc.), but defining personas at the sub-agent level is one of the most efficient ways to encode a world view and responsibilities, versus explicitly listing them.

I think it's just the opposite, as LLMs feed on human language. "You are a scrum master" automatically encodes most of what the LLM needs to know. Trying to describe the same role explicitly in a prompt would be a lot more difficult.

Maybe a different separation of roles would be more efficient in theory, but an LLM understands "you are a scrum master" from the get go, while "you are a zhydgry bhnklorts" needs explanation.

  • This has been pretty comprehensively disproven:

    https://arxiv.org/abs/2311.10054

    Key findings:

    - Tested 162 personas across 6 types of interpersonal relationships and 8 domains of expertise, with 4 LLM families and 2,410 factual questions

    - Adding personas in system prompts does not improve model performance compared to the control setting where no persona is added

    - Automatically identifying the best persona is challenging, with predictions often performing no better than random selection

    - While adding a persona may lead to performance gains in certain settings, the effect of each persona can be largely random

    Fun piece of trivia - the paper was originally designed to prove the opposite result (that personas make LLMs better). They revised it when they saw the data completely disproved their original hypothesis.

    • A persona is not the same thing as a role. The point of the role is to limit the work of the agent and to focus it on one or two behaviors.

      What the paper is really addressing is whether keywords like "you are a helpful assistant" give better results.

      The paper is not addressing a role such as "you are a system designer" or "you are a security engineer", which will produce completely different results and focus the LLM's output.

    • In a discussion about LLMs you link to a paper from 2023, when not even GPT-4 was available?

      And then you say:

      > comprehensively disproven

      ? I don't think you understand the scientific method


    • One study has "comprehensively disproven" something for you? You must be getting misled left, right, and centre if that's how you absorb study results.

I suppose it could end up being an LLM variant of Conway's Law.

“Organizations are constrained to produce designs which are copies of the communication structures of these organizations.”

https://en.wikipedia.org/wiki/Conway%27s_law

  • If so, one benefit is you can quickly and safely mix up your set of agents (a la Inverse Conway Manoeuvre) without the downsides that normally entails (people being forced to move teams or change how they work).

Developers actually do want managers, to simplify their daily lives. Otherwise they would self-manage better and keep more of the revenue share for themselves.

  • Unfortunately some managers get lonely and want a friendly face in their org meetings, or can’t answer any technical questions, or aren’t actually tracking what their team is doing. And so they pull in an engineer from their team.

    Being a manager is a hard job but the failure mode usually means an engineer is now doing something extra.