Comment by waynesonfire
2 years ago
If you think you need a better K, because you have insight into your distribtution, then don't use uniform, instead, use something more appropriate.
When k=N you degrade to LRU.
Your MAX / MEAN logic is completely flawed. If k=N then you'd have MAX(0..N), wouldn't this be optimal? But, this is LRU and it's not optimal.
All this is hand-waving. The proper response is to use the language of mathematics to prove how this algorithm behaves. Perhaps there is an optimal k.
I didn't realize we had moved on to talking about cache-replacement rather than putting balls in buckets. For putting balls in buckets (or any case where it is possible to compare options and come out with a "best" then k=N is optimal.
For cache-replacement strategies, you can prove very little for the general case, since for any two reasonable replacement strategies, there is usually an access pattern that favors one over the other. When TFA says "Random and FIFO are both strictly worse than either LRU or 2-random" the context is in running SPEC CPU with an 8-way associative cache. It's not hard to come up with artificial benchmarks that invert that relationship.
Also TFA contradicts your earlier assertion that k=3 is not better than k=2, at least when used with a pseudo-LRU: "Also, we can see that pseudo 3-random is substantially better than pseudo 2-random, which indicates that k-random is probably an improvement over 2-random for the k."