Comment by icepush

16 hours ago

Did you ask the question several times in fresh chat contexts to see if it sometimes gives the right answer?

Nah, n=1 is enough to give evidence that something is entirely broken, of course.

/s

  • Well, when we had deterministic tools, it would only take a single example of a calculator claiming 1+1=4 for me to throw it in the trash.

    • And if you can come up with a deterministic tool that can do everything LLMs can, then that would be amazing! Until then, we have to accept the non-determinism.