Comment by mjb
2 years ago
The balls-into-bins problem comes up often in distribute systems engineering, and so it's super important to have the right intuition. I think a lot of people come in to the space assuming that random balls-into-bins is much more even than it really is (and so underestimating the value of better-than-random placement).
Another interesting example is hash-based partitioning of data. It doesn't spread data out nearly as evenly as one might expect!
(I wrote a blog post about that too: https://brooker.co.za/blog/2018/01/01/balls-into-bins.html)
I have found remembering that both the law of large numbers and the central limit theorem are dependent on the independent and identically distributed assumption.
As an example, many computing systems are Markovian, which is memoryless with an exponential distribution.
So the second part of I.I.D. doesn't hold. But may be close enough for your needs.
However, in many real systems the aggregate behavior follows the asymptotic trend of the central limit theorem even though the underlying distributions may not fit the strict requirements.
Yes, I always start with the IID, but try watch for evidence that those assumptions need to change.
Too many convenient properties are lost once you lose IID to prematurely drop it.