If the details of the "hashing" scheme are publicized, I imagine it will be near trivial. Finding a digital description of an image such that two similar images compare equal, or at least similar, is a long-standing problem in computer vision.
The state of the art in this field is deep learning, and a /huge/ problem with the DL approach is that you can generate adversarial examples: for example, a picture of a teacup that /most/ networks identify as a dog. It's particularly damning because you apparently don't have to craft these for one particular network; different networks get tricked the same way, so to speak.
Since this algorithm presumably runs on-device, I am sure it won’t be long before someone has reverse engineered it…
Indeed, at which point we’ll know if Apple has implemented an obviously broken solution which opens us up to egregious government surveillance, or whether that is all just speculation without a factual basis.
If it’s a cryptographic hash - very hard.
This isn't cryptographic though. A cryptographic hash would make the entire database absolutely trivial to bypass with tiny imperceptible random changes to the images.
It's a perceptual hash.
It cannot be a cryptographically secure hash, simply because avoiding detection would then be trivial: change one channel in one pixel by one. Imperceptible change, different cryptographic hash.
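To make the one-pixel point concrete, here is a minimal sketch. It assumes a toy 8x8 grayscale image and a simple "average hash" as a stand-in perceptual hash; both are hypothetical illustrations, not whatever Apple actually ships.

```python
import hashlib

# Toy 8x8 grayscale "image": 64 pixel values, all multiples of 4 (0..252).
# Purely illustrative data, not a real photo.
pixels = [4 * i for i in range(64)]


def average_hash(px):
    """Toy perceptual hash: one bit per pixel, set if the pixel is above the mean."""
    mean = sum(px) / len(px)
    bits = 0
    for value in px:
        bits = (bits << 1) | int(value > mean)
    return bits


# Change one channel in one pixel by one.
tweaked = list(pixels)
tweaked[0] += 1

sha_before = hashlib.sha256(bytes(pixels)).hexdigest()
sha_after = hashlib.sha256(bytes(tweaked)).hexdigest()

crypto_changed = sha_before != sha_after
perceptual_changed = average_hash(pixels) != average_hash(tweaked)

print("SHA-256 changed:     ", crypto_changed)
print("average hash changed:", perceptual_changed)
```

In this toy case the SHA-256 digest changes while the average hash stays the same, which is exactly why a matching scheme has to be perceptual rather than cryptographic. A real perceptual hash would also resize and normalize the image first; that step is skipped here.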
The tweets talk about perceptual hashes, not cryptographic hashes.
But the probability is still not zero, and the number of iPhones in the world is large. A hash collision is possible, however unlikely.
The probability of Apple and all its devices winking out of existence due to quantum fluctuations is not zero. ‘Not zero’ is effectively zero if the number is small enough.
With 128-bit hashes, the birthday bound is about 2^64 images, i.e. “expect your first collision when each human buys 2305 iPhones, all with one terabyte storage, and then fills them up with photos that are an average of 1MB in file size”.
https://en.wikipedia.org/wiki/Birthday_attack
http://www.wolframalpha.com/input/?i=2%5E64%20%2F%20%288e9%2...
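The arithmetic behind the 2305 figure can be reproduced roughly as follows. This is a sketch assuming 8e9 humans (as in the query above), decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), and an ideal, uniformly distributed 128-bit hash. The trillion-photo example at the end is my own hypothetical number, and none of this says anything about a perceptual hash, where similar images collide on purpose.

```python
import math

HASH_BITS = 128
# Birthday bound: collisions become likely around sqrt(2**128) = 2**64 hashed items.
birthday_bound = 2 ** (HASH_BITS // 2)

humans = 8_000_000_000                 # 8e9, as in the Wolfram Alpha query
photos_per_phone = 10**12 // 10**6     # 1 TB of storage / 1 MB per photo

phones_per_human = birthday_bound // (humans * photos_per_phone)
print(phones_per_human)


def collision_probability(n, bits=HASH_BITS):
    """Birthday approximation: p ~= 1 - exp(-n^2 / 2^(bits + 1))."""
    return -math.expm1(-(n * n) / 2 ** (bits + 1))


# Hypothetical: one trillion photos hashed in total.
p_trillion = collision_probability(10**12)
print(p_trillion)
```

So "not zero" is true, but for an ideal 128-bit hash the number is small enough to ignore at any realistic photo count.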