Comment by mananaysiempre
4 days ago
> There are concepts like "k-dimensional equidistribution" etc. etc... where in some ways the requirements of a PRNG are far, far, higher than a cryptographically sound PRNG
Huh? If you can chew through however many gigabytes of the supposed CSPRNG’s output, do some statistics, and with a non-negligible probability tell if the bytes in fact came from the CSPRNG in question or an actual iid random source, then you’ve got a distinguisher and the CSPRNG is broken.
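To make "distinguisher" concrete: a statistical test is just a function of the output whose distribution you can predict under true randomness; if a generator's output reliably lands in the tails, the test distinguishes it. A minimal sketch (the classic monobit/frequency test, with `os.urandom` standing in as the source under test):

```python
import os

def monobit_z(data: bytes) -> float:
    """Z-score of the ones count in a bitstream: under the null
    hypothesis (iid uniform bits) this is approximately N(0, 1).
    A test that reliably yields large |z| on a generator's output
    is a distinguisher for that generator."""
    n = len(data) * 8
    ones = sum(bin(b).count("1") for b in data)
    # ones ~ Binomial(n, 1/2); standardize
    return (ones - n / 2) / (n / 4) ** 0.5

print(abs(monobit_z(os.urandom(1 << 20))))  # almost surely small
print(monobit_z(b"\xff" * 1024))            # grossly biased input: huge z
```

Real test batteries (NIST SP 800-22, TestU01, PractRand) are large collections of exactly this kind of test, just far more sensitive.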
It all comes down to which specific statistical tests matter, and how badly failing them hurts a specific application.
No CSPRNG is perfect; none has passed every statistical test ever thrown at it.
MCMC stresses very different statistical properties than the tests CSPRNGs are typically evaluated against.
In an absolute sense, every PRNG is broken. MCMC and crypto applications just push on different aspects, and statistical weaknesses in those aspects cause application-level failures.
See e.g. this paper https://www.cs.hmc.edu/tr/hmc-cs-2014-0905.pdf
(it's not the be-all and end-all, but it's a good survey of why this stuff matters and how the requirements differ)
> no CSPRNG has ever absolutely passed every statistical test thrown at it
As far as I know (admittedly not a high standard), there is no published statistical test that you could run on, for example, a single AES-256-CTR bitstream set up with a random key and IV, running on a single computer, that would be able to tell you with a meaningful likelihood ratio that you were looking at a pseudorandom rather than truly random input before the computer in question broke down. (I’m assuming related-key attacks are out of scope if we’re talking about an RNG for simulation purposes.)
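For reference, counter-mode generation is just "encrypt an incrementing counter under a fixed key". The Python stdlib has no AES, so this sketch substitutes SHA-256 for the block cipher; the point here is the CTR construction, not the primitive:

```python
import hashlib
import os

def ctr_blocks(key: bytes, nonce: bytes, nblocks: int) -> bytes:
    """Counter-mode keystream sketch. Real AES-256-CTR computes
    AES_key(nonce || counter) per block; SHA-256(key || nonce || counter)
    stands in here because the stdlib has no AES."""
    out = []
    for ctr in range(nblocks):
        out.append(hashlib.sha256(key + nonce + ctr.to_bytes(8, "big")).digest())
    return b"".join(out)

key, nonce = os.urandom(32), os.urandom(16)
stream = ctr_blocks(key, nonce, 4)          # 128 keystream bytes
assert stream == ctr_blocks(key, nonce, 4)  # deterministic given (key, nonce)
```

The state is just (key, nonce, counter), so the stream is trivially seekable, which is one reason CTR-based generators are popular for parallel simulation too.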
Cryptographic operations, when done correctly, result in full chaos within the discrete domain (the so-called avalanche effect). Any bias of any kind gives rise to a distinguisher, and the primitive is regarded as broken.
One way to imagine what symmetric cryptography does is a cellular automaton that is completely shuffled every iteration. In the case of Keccak/SHA3, that is almost exactly what happens too.
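The avalanche effect is easy to see directly: flip a single input bit and roughly half the output bits of a good primitive flip. With SHA3-256 (Keccak) from the stdlib:

```python
import hashlib

def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg = bytearray(b"avalanche effect demo")
h1 = hashlib.sha3_256(bytes(msg)).digest()
msg[0] ^= 0x01                     # flip a single input bit
h2 = hashlib.sha3_256(bytes(msg)).digest()

# Of the 256 output bits, expect roughly half (about 128) to differ
print(hamming(h1, h2))
```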
"before the computer in question broke down."
A big MCMC simulation might actually test that! Say, training a large diffusion model: it takes far more compute than the mean time to failure of a single machine.
Also, the standard those tests apply is different from asking whether the generator biases the statistical model being fitted with MCMC.
I'm aware of test results against ChaCha20 at https://www.pcg-random.org/index.html, but I'm not aware of comparable tests against AES-256-CTR.
At some point, though, 100x faster performance without an exploitable attack vector is also relevant! (though sometimes people do find ways).
CSPRNGs are designed against very specific attack vectors, and sure, they're likely to be completely unpredictable. But other applications care more about properties like k-dimensional equidistribution, and lacking those hurts them far more.
The idea that CSPRNGs are the be-all and end-all of RNGs holds CS back.
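As an illustration of what k-dimensional equidistribution means, here is a deliberately weak toy LCG (parameters chosen per the Hull–Dobell theorem for full period, purely for demonstration): it is perfectly 1-dimensionally equidistributed yet hopeless in 2 dimensions.

```python
M = 1 << 16

def step(x: int) -> int:
    # Toy LCG: full period mod 2**16 by Hull-Dobell (c odd, a % 4 == 1)
    return (77 * x + 1) % M

xs, x = [], 0
for _ in range(M):
    xs.append(x)
    x = step(x)

# 1-dimensionally equidistributed: every value occurs exactly once per period
print(len(set(xs)) == M)                # True
# ...but consecutive pairs hit only M of the M*M possible lattice points,
# so it is nowhere near 2-dimensionally equidistributed
pairs = set(zip(xs, xs[1:] + xs[:1]))
print(len(pairs), "of", M * M)          # 65536 of 4294967296
```

A generator that is k-dimensionally equidistributed guarantees every k-tuple of outputs occurs equally often over the full period; that is a property you state about a generator's algebraic structure, not something a black-box distinguisher framing naturally captures.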