Comment by yesensm
12 hours ago
I’m curious whether anyone has measured this systematically. Right now most of the evidence for multi-agent setups still feels anecdotal.
And expensive, exactly the way a pay-per-use product would push its customers…
“It’s not working well enough!” we tell them. They respond with “Have you tried using it more?”
Back in 2024 I read a study saying: "Ask 4 LLMs the same question; if they all give you the same answer, there's a 95-99% chance it's correct."
Soooo... It's not just greed. There is something there.
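The agreement heuristic from that study can be sketched as a simple voting wrapper. This is a minimal illustration, not the study's actual setup; the `ask` callables stand in for real LLM API calls, and the unanimity threshold is an assumption:

```python
from collections import Counter

def ensemble_answer(question, models, threshold=1.0):
    """Ask several models the same question and accept the answer
    only when enough of them agree (unanimity by default)."""
    answers = [ask(question) for ask in models]
    answer, count = Counter(answers).most_common(1)[0]
    if count / len(models) >= threshold:
        return answer  # consensus reached: treat as likely correct
    return None  # models disagree: flag the answer as unreliable

# Stub "models" standing in for real LLM calls (hypothetical, for the demo).
unanimous = [lambda q: "42"] * 4
split = [lambda q: "42"] * 3 + [lambda q: "7"]

print(ensemble_answer("q", unanimous))  # prints "42" (all four agree)
print(ensemble_answer("q", split))      # prints "None" (no unanimity)
```

Lowering `threshold` (e.g. to 0.75) trades some of that confidence for fewer rejected answers.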
Yes exactly. I’m talking about this in the article. I found out that when Claude and Codex both review the same PR and both find the same issue, our team fixes it 100% of the time.
What's the point of pair programming then if they both have the same opinions?
Haha yeah... Wait until they start jacking up the subscription prices
They don't change the prices; they just modify the amount of compute allocated: slower speeds, fewer tokens. They can tune everything in the background to optimize costs and returns, and the user never realizes anything has changed.
Sometimes they'll announce the changes, and they'll even try to spin it as improving services or increasing value.
Local AI capabilities are improving at a rapid pace. At some point soon we'll have an RWKV or a 4B LLM that performs at a GPT-5 level, with reasoning and all the bells and whistles, and hopefully that will shake out most of the deceptive and shady tactics the big platforms are using.
Completely with you on this! But then we need to define the criteria for comparison. Might not be that easy, unfortunately.