Comment by yesensm
12 hours ago
I’m curious whether anyone has measured this systematically. Right now most of the evidence for multi-agent setups still feels anecdotal.
And expensive, exactly the way a pay-per-use product would push its customers…
“It’s not working well enough!” we tell them. They respond with “Have you tried using it more?”
Back in 2024 I read a study saying: "Ask 4 LLMs the same question; if they all give you the same answer, there's a 95-99% chance it's correct."
Soooo... It's not just greed. There is something there.
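The agreement heuristic from that study can be sketched as a simple voting wrapper. This is a minimal illustration, not the study's actual setup; the `ask` callables stand in for real LLM API calls, and the unanimity threshold is an assumption:

```python
from collections import Counter

def ensemble_answer(question, models, threshold=1.0):
    """Ask several models the same question and accept the answer
    only when enough of them agree (unanimity by default)."""
    answers = [ask(question) for ask in models]
    answer, count = Counter(answers).most_common(1)[0]
    if count / len(models) >= threshold:
        return answer  # consensus reached: treat as likely correct
    return None  # models disagree: flag the answer as unreliable

# Stub "models" standing in for real LLM calls (hypothetical, for the demo).
unanimous = [lambda q: "42"] * 4
split = [lambda q: "42"] * 3 + [lambda q: "7"]

print(ensemble_answer("q", unanimous))  # prints "42" (all four agree)
print(ensemble_answer("q", split))      # prints "None" (no unanimity)
```

Lowering `threshold` (e.g. to 0.75) trades some of that confidence for fewer rejected answers.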
Yes exactly. I’m talking about this in the article. I found out that when Claude and Codex both review the same PR and both find the same issue, our team fixes it 100% of the time.
What's the point of pair programming then if they both have the same opinions?
Haha yeah... Wait until they start jacking up the subscription prices
They don't change the prices; they just modify the amount of compute allocated: slower speeds, fewer tokens. They can tune everything in the background to optimize costs and returns, and the user never realizes anything has changed.
Sometimes they'll announce the changes, and they'll even try to spin it as improving services or increasing value.
Local AI capabilities are improving at a rapid pace. At some point soon we'll have an RWKV or a 4B LLM that performs at a GPT-5 level, with reasoning and all the bells and whistles, and hopefully that will shake out most of the deceptive and shady tactics the big platforms are using.
Completely with you on this! But then we need to define the criteria for comparison. Might not be that easy, unfortunately.