Comment by prng2021

1 day ago

How is anyone predicting timelines for AGI when these systems can’t do basic addition of 2 arbitrary numbers with 100% accuracy?

Can you do basic addition of 2 arbitrary numbers with 100% accuracy (no tools)? No, you can't. You will make mistakes for a sufficiently large N even with pen and paper, and for a very small N without them. Are you no longer generally intelligent?

LLMs should use tool calling (which is 100% reliable for arithmetic) instead of doing math internally. More generally, it would be nice to be able to teach a process and have the AI execute it deterministically. In some sense, reliability between 99% and 100% is the worst range: you still can't trust the output, yet verifying every answer feels like wasted effort. Maybe code gen and execution will get us there.
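The tool-calling idea above can be sketched in a few lines. This is a minimal illustration, not any specific provider's API: the model emits a structured tool call, and a deterministic calculator executes it. Python ints are arbitrary-precision, so the tool itself is exact for any N.

```python
# Hypothetical sketch: route arithmetic to a deterministic tool
# instead of letting the model compute it internally.

def add_tool(a: str, b: str) -> str:
    """Exact addition for integers of any size (Python ints are arbitrary-precision)."""
    return str(int(a) + int(b))

# A model response requesting a tool call (shape is illustrative only).
tool_call = {"name": "add", "arguments": {"a": "9" * 40, "b": "1"}}

if tool_call["name"] == "add":
    result = add_tool(**tool_call["arguments"])
    print(result)  # 40 nines plus 1, computed exactly
```

The point is that the model only has to produce the *call* correctly; the arithmetic itself never touches the model's weights.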

  • This is the exact problem CognOS was built to solve.

      99% reliable means you still can't remove the human from the loop, because you never know which 1% you're in. The only way to actually trust an output is to attach a verifiable confidence signal to each response, not just hope the aggregate accuracy holds.

      We built a local gateway that wraps every LLM output in a trust envelope: a decision trace, a risk score, and an explicit PASS/REFINE/ESCALATE/BLOCK classification. The point isn't to make LLMs more accurate; it's to make their uncertainty legible so the human knows when to step in.
    
      Open source if you want to look at the architecture: github.com/base76-research-lab/operational-cognos
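To make the envelope idea concrete, here is a hedged sketch of what such a wrapper might look like. The field names, thresholds, and `classify` function are invented for illustration; they are not CognOS's actual API (see the linked repo for that).

```python
# Hypothetical sketch of a trust envelope around an LLM output.
# Names and thresholds are invented, not taken from CognOS.
from dataclasses import dataclass

@dataclass
class TrustEnvelope:
    output: str
    risk_score: float          # 0.0 = lowest risk, 1.0 = highest
    decision_trace: list       # human-readable reasons for the classification
    classification: str        # PASS / REFINE / ESCALATE / BLOCK

def classify(output: str, risk_score: float) -> TrustEnvelope:
    trace = ["risk_score=%.2f" % risk_score]
    if risk_score < 0.1:
        label = "PASS"       # safe to use without review
    elif risk_score < 0.5:
        label = "REFINE"     # send back to the model for another pass
    elif risk_score < 0.9:
        label = "ESCALATE"   # require a human in the loop
    else:
        label = "BLOCK"      # do not release the output
    trace.append("classification=" + label)
    return TrustEnvelope(output, risk_score, trace, label)
```

The design choice here is that the envelope travels *with* the output, so a downstream consumer can branch on `classification` instead of trusting aggregate accuracy.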

  • "reliability between 99% and 100% is the worst because you still can't trust the output"