Comment by zythyx

15 hours ago

Nah, n=1 is enough to give evidence that something is entirely broken, of course.

/s

Well, when we had deterministic tools, it would only take a single example of a calculator claiming 1+1=4 for me to throw it in the trash.

  • And if you can come up with a deterministic tool that can do everything LLMs can, that would be amazing! Until then, we have to accept the non-determinism.