
Comment by Brian_K_White

14 hours ago

I hate to help provide possible solutions to an entire process I don't approve of, but maybe the fuzzy tools need old-style deterministic tools the same way and for the same reasons we do.

So instead of an LLM trying to answer a math or reasoning question by finding a statistical match with other similar groups of words it found on 4chan, the All-In podcast, and a terrible recipe for soup written by a terrible cook, it can use a calculator when it needs a calculator answer.
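
To make the calculator idea concrete, here is a minimal sketch of such a deterministic tool: it parses an arithmetic expression into an AST and evaluates it with real operators, so the answer is computed rather than pattern-matched. This is an illustration, not any particular vendor's implementation.

    import ast, operator

    # Deterministic arithmetic tool: real parsing and real operators,
    # so 2 + 2 is arithmetic, not autocomplete.
    OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv,
           ast.Pow: operator.pow, ast.USub: operator.neg}

    def calculator(expr: str) -> float:
        def walk(node):
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                return node.value
            if isinstance(node, ast.BinOp) and type(node.op) in OPS:
                return OPS[type(node.op)](walk(node.left), walk(node.right))
            if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
                return OPS[type(node.op)](walk(node.operand))
            raise ValueError("only plain arithmetic is allowed")
        return walk(ast.parse(expr, mode="eval").body)

    print(calculator("3 * (17.5 - 2) / 5"))  # 9.3, every time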

They absolutely need deterministic tools. What you just described is exactly how the current popular AI agents work. They use "harnesses", which to me is just a rebranding of what we have known all along about building useful and reliable software: composable, orchestrated systems made of different pieces, selected for their capabilities and constraints, glued together for specific outcomes.
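
In that spirit, a "harness" can be sketched as little more than a tool registry plus a dispatch loop. Everything below is a hypothetical stand-in (the Reply shape, ask_model, the canned model behavior); real harnesses differ, but the glue looks roughly like this.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Reply:
        text: str
        tool: Optional[str] = None   # None means the model answered directly
        args: str = ""

    TOOLS = {"calculator": calculator}   # the deterministic tool sketched above

    def ask_model(history):
        # Hypothetical stand-in for a real LLM call: request the calculator
        # first, then answer once the tool result is in the history.
        for h in history:
            if h.startswith("calculator returned"):
                return Reply(text="The answer is " + h.split(": ", 1)[1])
        return Reply(text="", tool="calculator", args="12 * 7")

    def run_agent(task: str) -> str:
        history = [task]
        while True:
            reply = ask_model(history)
            if reply.tool is None:
                return reply.text                    # model answered directly
            result = TOOLS[reply.tool](reply.args)   # deterministic step
            history.append(f"{reply.tool} returned: {result}")

    print(run_agent("What is 12 * 7?"))   # The answer is 84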

It just feels like for some reason this is all being relearned with LLMs. I guess shortcuts have always been tempting. And the idea of a "digital panacea" is too hard to resist.

I think that is how the smarter agents do things? Just like Claude/ChatGPT sometimes does a web search, they can make other tool calls instead of just making a statistical guess. Of course, the model doesn’t always make the right choice between those options though…

  • They will also lie and produce output saying it is based on tool execution, without having actually used the tool.

    Yes, another layer that cross-checks a claim like “in kubectl logs I see …” against an actual k8s tool call can help (a toy version is sketched below this list), that is, when the cross-check layer doesn’t lie either.

    For the time being, IMHO, human validation at key points is the only way to get good results. This is why the tools make experienced people potentially a lot more efficient (they are quick to spot errors/BS) and inexperienced people potentially more dangerous (they’re more prone to trusting the responses, since the tone usually sounds very professional).
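
    A toy version of that cross-check layer, assuming kubectl is on the PATH (the pod name and quoted line below are invented): re-run the command the model claims to have run and confirm the quoted output is actually there.

        import subprocess

        # Hypothetical cross-check: the model claimed "in kubectl logs I see X";
        # run the real command and confirm X actually appears in the output.
        def verify_log_claim(pod: str, quoted_line: str) -> bool:
            out = subprocess.run(
                ["kubectl", "logs", pod],
                capture_output=True, text=True, check=True,
            ).stdout
            return quoted_line in out

        # Invented example: flag the answer for human review if the claim fails.
        # if not verify_log_claim("payments-7d4f", "connection refused"):
        #     escalate_to_human()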

Doesn't agentic AI do this? I've got AI running in VS Code. If I ask it for something, it can fill a code cell with a little bit of Python, and then run it with my approval. It's using the Python interpreter on my computer as a calculator.
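
That approval step can be sketched generically as a gate between "model proposed code" and "interpreter ran code"; this is not how VS Code implements it, just the shape of the idea.

    # Generic approval gate: show the proposed code, run it only on an explicit "y".
    def run_with_approval(code: str) -> None:
        print("Model proposes:\n" + code)
        if input("Run this? [y/N] ").strip().lower() == "y":
            exec(code)   # hands the math to the actual Python interpreter
        else:
            print("Skipped.")

    run_with_approval("print(sum(range(1, 101)))")   # prints 5050, computed, not guessed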

No, they just need to be trained to have adversarial self-review "thinking" processes.

You ask an LLM "What's wrong with your answer?" and you get pretty good results.
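
That question is easy to wire up as a second pass. llm() below is a hypothetical stand-in for whatever chat-completion API is in use; the point is the draft/critique/revise shape, not the specific prompts.

    # Adversarial self-review as a second pass: draft, critique, revise.
    def llm(prompt: str) -> str:
        raise NotImplementedError("hypothetical stand-in for a real model call")

    def answer_with_self_review(question: str) -> str:
        draft = llm(question)
        critique = llm(f"Question: {question}\nAnswer: {draft}\n"
                       "What's wrong with this answer? List concrete errors.")
        return llm(f"Question: {question}\nAnswer: {draft}\nCritique: {critique}\n"
                   "Rewrite the answer, fixing the listed errors.")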