Comment by pron

7 days ago

I'm well aware that LLMs are more than capable enough to successfully perform straightforward, boring tasks 90% of the time. The problem is that there's a small but significant enough portion of time where I think a problem is simple and straightforward, but it turns out not to be once you get into the weeds, and if I can't trust the tool to tell me if we're in the 90% problem or the 10% problem, then I have to carefully review everything.

I'm used to working with tools, such as SMT solvers, that may fail to perform a task, but they don't lie about their success or failure. Automation that doesn't either succeed or report a failure reliably is not really automation.

Again, I'm not saying that the work done by the LLM is useless, but the tradeoffs it requires make it dramatically different from how both tools and humans usually operate.