Comment by BowBun

1 month ago

Much like Windows threads where people experience strange bugs, it's impossible to say without knowing any of their workloads and tools. We've got a team of 30 using it full time, and as a member of eng leadership I would be hearing if it was constantly missing expectations. It did take iterations to get here, as with everything.

Some of the usual suspects when people are getting bad results:

* Overbloated claude.md, it should not contain everything, it should be a table of contents pointing to other files
* Max effort - why? Overthinking on simpler tasks results in degraded quality, much like in humans.
* You speak of your single session but with agents reviewing other agent outputs. Without knowing your goal and your prompt, and what the agents had access to, my first inclination is that the initial request was vague, a bunch of unnecessary info was returned, and your review step caught that extra jank.

I'm not gonna bother making the joke you allude to, but every single employee I've worked with in person has had glaring holes in their setups which, once solved, dramatically reduced stuff like what you're talking about.
