Comment by lynx97

1 day ago

I am surprised such a simple approach took this long to actually get used. My first image-description CLI attempt did basically that: use `n` to get several answers, then run another pass to summarize them.
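The pattern is roughly: one request with `n > 1` to draw several independent samples, then a second call that reconciles them. Here's a minimal sketch; `generate` is a stub standing in for a real chat-completion call (a real client would forward the `n` parameter to the API), and the candidate strings are just placeholder outputs.

```python
import random

def generate(prompt: str, n: int = 1, seed: int = 0) -> list[str]:
    # Stub for a real model call. A real implementation would send one
    # request with the `n` parameter and get back n sampled completions.
    rng = random.Random(seed)
    candidates = [
        "A tabby cat sitting on a sunny windowsill.",
        "A cat on a windowsill, looking outside.",
        "An orange cat near a bright window.",
    ]
    return [rng.choice(candidates) for _ in range(n)]

def describe_image(prompt: str, n: int = 3) -> str:
    # Pass 1: draw n independent descriptions of the same image.
    drafts = generate(prompt, n=n)
    # Pass 2: a second call merges the drafts into one description,
    # keeping details that several samples agree on.
    summary_prompt = "Merge these descriptions into one:\n" + "\n".join(drafts)
    return generate(summary_prompt, n=1)[0]
```

The second pass is what makes this cheap self-consistency rather than just best-of-n: the summarizer sees all drafts at once and can keep the details they agree on.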

People have played with (multi-)agentic frameworks for LLMs from the very beginning, but it seems like only now, with powerful reasoning models, is it really making a difference.

It's very resource intensive so maybe they had to wait until processes got more efficient? I can also imagine they would want to try and solve it in a... better way before doing this.

I built a similar thing around a year ago with AutoGen. The difference now is that models can really be steered toward "part" of the overall goal, and they actually follow that.

Before this, even the best "math" models were RL'd to death to only solve problems. If you wanted one to explore "method_a" of solving a problem, you'd be SoL. The model would start like "ok, the user wants me to explore method_a, so here's the solution: blablabla" and then do whatever it wanted, unrelated to method_a.

Same thing for gathering multiple sources. Only recently can models actually pick the best option out of many instances and work effectively at large context lengths. The previous tries at 1M context were at best gimmicks, IMO. Gemini 2.5 seems like the first model that can actually do useful work past 100-200k tokens.

I agree, but I think it's hard to get a performance increase sufficient to justify a 3-4x increase in cost.