Comment by mjb

2 years ago

Two random choices has a few significant advantages over "find the least filled one":

- It's O(1), rather than O(buckets).

- It's robust to stale data and concurrency. One way systems try to avoid the O(buckets) work is by caching, or by scanning in parallel, either of which cause stale load estimates. "Pick the best" tends to lead to significant over-filling http://www.eecs.harvard.edu/~michaelm/postscripts/handbook20... or https://brooker.co.za/blog/2012/01/17/two-random.html).

- Best-of-k (rather than best-of-two) provides a parameter `k` that allows the system to tradeoff between optimality (higher k is more optimal), and robustness to stale data and concurrent accesses (lower k is more robust). Mitzenmacher's core observation is that the step up from k=1 to k=2 is a massive improvement in optimality with only a small loss in robustness (and much more than most people would expect).