Comment by aidenn0

2 years ago

> This sounds very powerful! And intuitively doesn't make any sense to me. So, say I have n choices, I do not know which choice is better. Does this result say that it's more efficient to randomly choose k and find the best among them (even for k=2), instead of randomly choosing a single choice? Could someone point me in the right direction?

Yes, let's consider the k=2 vs selecting a random item from a set of N items:

Selecting one item uniformly randomly from a set of N is identical to selecting two distinct items and then picking one of those two uniformly randomly.

So if you select items A and B, then pick the best item, you end up with an item of MAX(A,B) utility instead of MEAN(A,B) utility. MAX(A,B) >= MEAN(A,B) should hopefully be obvious.

6 comments

aidenn0

mjevans 2 years ago

Random numbers sound possibly expensive for 'small' caches. Though I /offhand/ I guess if there's a hardware RNG source it doesn't need to be cryptographically secure or trusted as long as it's fast for this task.

Similarly if updates are bursty an idle cull could use a bitvector to mark free slots within a fixed array for rather low overhead and a burst of speedy inserts when load picks up. Something close is probably already necessary overhead for when the cache isn't yet populated.

girvo 2 years ago

You've explained that wonderfully, thank you! I'm not the original commenter, but was also confused by my intuition here. That makes a lot more sense now :)

waynesonfire 2 years ago
It didnt make any sense to me.
I think an intuitive perspective is that if you pick one item randomly, you cant use the LRU signal. So, you pick two items. Now you get to also use LRU. Notice that it doesnt add any value to pick three random items and then apply LRU.
- aidenn0 2 years ago
  
  > Notice that it doesnt add any value to pick three random items and then apply LRU.
  It actually does. The more items you pick, the better it works, because at the ultimate stage of k=N you will always be picking the best item. For a lot of real-world distributions, there are rapidly diminishing returns with higher k, but the returns are still there.
  [edit]
  The same example as above works for k=3, observing that discarding one of the 3 chosen uniformly randomly devolves to the k=2 form.
  k=3: MAX(A,B,C)
  k=2: MEAN(MAX(A,B),MAX(B,C),MAX(A,C))
  k=1:
  
  2 replies →