
Comment by acron0

6 days ago

Without meaning to sound flippant or dismissive, I think you're overthinking it. By the sounds of it, agents aren't offering what you say you need. What they _are_ offering is the boilerplate, the research, the planning etc. All the stuff that's ancillary. You could quite fairly say that it's in the pursuit of this stuff that details and ideas emerge, and I would agree, but sometimes you don't need ideas. You need solutions that are run-of-the-mill and boring.

I'm well aware that LLMs are more than capable of performing straightforward, boring tasks successfully 90% of the time. The problem is the small but significant fraction of the time when a problem that looks simple turns out not to be once you get into the weeds. If I can't trust the tool to tell me whether we're in the 90% case or the 10% case, then I have to carefully review everything.

I'm used to working with tools, such as SMT solvers, that may fail to perform a task, but they don't lie about their success or failure. Automation that can't reliably either succeed or report failure isn't really automation.
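For contrast, here's a minimal sketch of what I mean, using Z3's Python bindings as one concrete SMT example (the specific constraints are just for illustration): `check()` returns exactly one of sat, unsat, or unknown, so when the solver can't decide, it says so instead of pretending it succeeded.

```python
# Minimal Z3 sketch: the solver reports its status explicitly.
from z3 import Solver, Int, sat, unsat

x = Int('x')

s = Solver()
s.add(x > 0, x < 0)           # deliberately unsatisfiable constraints
result = s.check()            # returns exactly one of: sat, unsat, unknown

if result == sat:
    print("solved:", s.model())                    # a concrete model backs up the claim
elif result == unsat:
    print("proved impossible")                     # a definitive negative answer
else:
    print("solver gave up:", s.reason_unknown())   # failure is reported, not hidden
```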

Again, I'm not saying that the work done by the LLM is useless, but the tradeoffs it requires make it dramatically different from how both tools and humans usually operate.