
Comment by greiskul

17 hours ago

> you can never have too many sources of entropy

This is so true. And the beauty is that with algorithms, we don't even need to know much about the entropy to be able to extract it.

There is the Von Neumann method of generating an unbiased coin from a biased coin: throw it twice, and check whether you got HT or TH, completely discarding all HH and TT results. It doesn't matter whether the coin you are using lands heads 20% or 80% of the time; the result will be a true 50/50.

There are more modern algorithms that do even better, in that they need fewer coin tosses when the coin is very unbalanced.
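
The von Neumann trick above is easy to sketch in code. A minimal Python version (the 20% coin is an assumed example, not something from the thread):

```python
import random

random.seed(0)  # only for reproducibility of the demo

def von_neumann_bits(biased_flip, n):
    """Extract n unbiased bits from an arbitrarily biased 0/1 source.

    Flips are drawn in pairs: (1, 0) emits 1, (0, 1) emits 0,
    and equal pairs (the HH/TT cases) are discarded entirely.
    """
    bits = []
    while len(bits) < n:
        a, b = biased_flip(), biased_flip()
        if a != b:
            bits.append(a)
    return bits

# A coin that lands heads (1) only 20% of the time.
coin = lambda: 1 if random.random() < 0.2 else 0

sample = von_neumann_bits(coin, 100_000)
print(sum(sample) / len(sample))  # comes out close to 0.5 despite the bias
```

Note the cost: with a 20/80 coin only 2 × 0.2 × 0.8 = 32% of pairs survive, which is why the more modern extractors mentioned above matter for very unbalanced sources.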

And then there is modern cryptographic hashing. Feed it all the bits you can. In the real world, collisions only end up happening if every single one of those bits is identical. So if you feed in actual entropy, which cannot be controlled, predicted, or replicated, modern cryptography tells you that the end result is unique.
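
The pooling idea can be sketched the same way: collect whatever bytes you have and run them through a cryptographic hash. A minimal sketch with Python's hashlib (the source strings here are made-up placeholders):

```python
import hashlib

# Hypothetical entropy sources; in practice these would be raw bytes
# from input timings, hardware noise, the OS RNG, and so on.
sources = [b"keystroke timings: 132 41 87 5", b"fan noise sample", b"os randomness"]

h = hashlib.sha256()
for chunk in sources:
    h.update(chunk)  # feed in every bit you can

digest = h.hexdigest()
print(digest)  # 256-bit digest; differs if any input bit differs
```

Feeding the sources incrementally via update() gives the same digest as hashing their concatenation, so new entropy can be mixed in as it arrives.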

> There is the Von Neumann method of generating an unbiased coin from a biased coin. Of throwing it twice, and checking if you got HT or TH. And completely discarding all HH or TT results. It doesn't matter if the coin you are using is 20% or 80%, the result will be a true 50/50.

This blew my mind. Thank you!

I had to think about it a bit, so for anyone scratching their head right now trying to figure it out, consider it this way:

what matters is the ordering: heads-then-tails, or tails-then-heads.

It doesn't matter that the coin is biased one way or the other; if you keep flipping pairs until you get a pair with two different values, it's a 50/50 chance whether the less-likely result comes first or second.

You might only have a 20% chance of any particular flip being a tails (for example), but in the cases where a pair does contain exactly one tails, it's a 50/50 chance that it comes first or second.
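
A quick simulation makes the symmetry concrete (a sketch; the 80/20 bias is just an example):

```python
import random

random.seed(0)  # only for reproducibility

ht = th = 0
for _ in range(200_000):
    # one pair of flips from a coin with an 80% chance of heads
    a = 'H' if random.random() < 0.8 else 'T'
    b = 'H' if random.random() < 0.8 else 'T'
    if (a, b) == ('H', 'T'):
        ht += 1
    elif (a, b) == ('T', 'H'):
        th += 1

print(ht, th)  # the two counts come out nearly equal
```

Most pairs are HH and get thrown away, but among the surviving mixed pairs, HT and TH are equally common.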

  • And for people who like equations, here is my attempt at explaining it.

    Assume each flip is independent and the bias remains the same on each flip.

    Let

      P(H) = p,
      P(T) = 1 - p.
    

    Then

      P(HH) = p^2,
      P(HT) = p(1 - p),
      P(TH) = (1 - p)p,
      P(TT) = (1 - p)^2.
    

    Therefore

      P(HT or TH) = 2p(1 - p).
    

    Now calculate

      P(HT | HT or TH) = p(1 - p) / (2p(1 - p)) = 1/2,
      P(TH | HT or TH) = (1 - p)p / (2p(1 - p)) = 1/2.
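
    The algebra above can be double-checked numerically (a sketch; p = 0.3 is an arbitrary bias):

    ```python
    p = 0.3  # arbitrary bias, P(H)
    p_ht = p * (1 - p)
    p_th = (1 - p) * p
    p_either = p_ht + p_th  # = 2p(1 - p)

    print(p_ht / p_either, p_th / p_either)  # both come out as 0.5
    ```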

    • You don't need conditional probability here, as the flips are independent.

      It's just p(H)p(T).

      And p(H)p(T) = p(T)p(H), thus 2*p(H)p(T) = 2p(1-p).

  • I was doubting this for a minute, as I wondered whether, with a coin significantly biased towards heads, you would be more likely to get HT. With probability problems like Monty Hall, I like to think about extreme cases: say it's 99 heads to every 1 tails. You'd expect HT 0.99% of the time. Ditto TH.

    • You can’t flip coins until you get the first different outcome… You have to flip twice each time, until you get a pair with different outcomes.

  • Thanks for your explanation. I did not get it on the first read, and was too lazy to think it through until I saw your comment.

    Just want to point out that if one is actually doing the experiment with a biased coin, then one must discard results in whole, non-overlapping pairs.

    E.g. take a coin which is heavily biased, say 0.9 H and 0.1 T. One should discard all the HH pairs and start each new pair only at an odd index. Otherwise, given a sequence like HHHHT, one might fail to skip the second HH pair and instead greedily pick up the first HT (flips 4 and 5), which makes the experiment biased towards HT.
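
    That pitfall is easy to demonstrate. A sketch comparing correct non-overlapping pairing with a greedy scan for any adjacent HT/TH (the 0.9/0.1 bias matches the example above):

    ```python
    import random

    random.seed(1)  # only for reproducibility

    seq = ['H' if random.random() < 0.9 else 'T' for _ in range(400_000)]

    def paired(seq):
        """Correct: walk non-overlapping pairs, keep the first flip of mismatches."""
        out = []
        for i in range(0, len(seq) - 1, 2):
            if seq[i] != seq[i + 1]:
                out.append(seq[i])
        return out

    def greedy(seq):
        """Wrong: grab any adjacent mismatch, ignoring pair boundaries."""
        out = []
        i = 0
        while i + 1 < len(seq):
            if seq[i] != seq[i + 1]:
                out.append(seq[i])
                i += 2
            else:
                i += 1
        return out

    for name, out in [('paired', paired(seq)), ('greedy', greedy(seq))]:
        print(name, out.count('H') / len(out))
    # paired comes out near 0.5; greedy is heavily biased towards H
    ```

    The greedy scan almost always catches the HT at the end of a long run of heads and then skips over the matching TH, so its output is mostly H.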

  • Afaics it's just basic commutativity – p(H)p(T) = p(T)p(H) – since instances are independent.

    Same, of course, holds for flipping it multiple times. But there you get more than just heads or tails: the outcome counts follow the binomial coefficients binom(n, k).