Comment by CuriouslyC
10 hours ago
My experience with benchmarks and evals is that it can take ~20 runs of a problem for the distribution of answers to start to converge. Ideally you'd know the convergence properties of your algorithm ahead of time and use a Bayesian approach that makes the uncertainty explicit.
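
One way to make that uncertainty explicit is sketched below. It assumes each run is scored pass/fail, which the comment does not specify, and puts a uniform Beta(1, 1) prior on the pass rate. The function name `beta_posterior_summary` and its parameters are hypothetical, and the 95% interval is estimated by sampling from the posterior rather than computed exactly.

```python
import random


def beta_posterior_summary(passes, runs, prior_a=1.0, prior_b=1.0,
                           draws=20000, seed=0):
    """Summarize a Beta posterior over a pass rate after `runs` pass/fail runs.

    Assumption: runs are independent Bernoulli trials with a shared pass rate.
    The default prior Beta(1, 1) is uniform on [0, 1].
    """
    if not 0 <= passes <= runs:
        raise ValueError("passes must be between 0 and runs")

    # Conjugate update: successes add to a, failures add to b.
    a = prior_a + passes
    b = prior_b + (runs - passes)

    # Approximate the 95% credible interval by Monte Carlo sampling.
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    low = samples[int(0.025 * draws)]
    high = samples[int(0.975 * draws) - 1]

    return {"mean": a / (a + b), "low": low, "high": high}


if __name__ == "__main__":
    # Same observed pass rate (60%) at increasing run counts.
    for passes, runs in [(3, 5), (12, 20), (60, 100)]:
        s = beta_posterior_summary(passes, runs)
        print(f"{runs:>3} runs: mean={s['mean']:.3f} "
              f"95% interval=({s['low']:.3f}, {s['high']:.3f})")
```

Running it at the same observed pass rate with 5, 20, and 100 runs shows the interval narrowing as runs accumulate, which is the convergence being described. Scores that are not pass/fail would need a different likelihood.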