Comment by Gatsky
4 years ago
If the types of photos you mention were getting flagged as in the case in the article, we would be hearing about a lot more cases. The case itself involved a medical photo, not smiling children at the beach. As such, the denominator you use isn't relevant here.
There is a lot of child pornography on image-sharing platforms. Facebook, which reports most reliably, made 20 million CSAM reports in 2021, and those were photos posted to a social network or sent via Messenger, not even photos in private albums. As I understand it, CSAM detection focuses more on reporting re-uploads or transmission of known abuse material than on new material. So I still maintain that if we take the type of image referenced in the article as the denominator, vastly more of such photos would be child abuse material. Granted, 1:100000 is too high; I would revise that down to 1:1000.
Some fair points, but I have a couple of counterpoints...
One, we really need to know the base rates and false-positive rates to reason well here.
Two, I think you'd also need to consider the number of people and the number of photos. So like, a photo can be considered a trial, OR flagging a person can. I would imagine a given pedophile has a large number of CSAM photos, but there are few pedophiles. There are, however, lots of normal parents with fewer photos each that might trigger a false positive. Because of that base-rate issue, even if the probability that a given photo is falsely identified as CSAM is quite low, the probability that a given person is falsely flagged or reported may NOT be low.
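To make that base-rate point concrete, here's a back-of-the-envelope sketch. Every number in it is made up purely for illustration (per-photo false-positive rate, photo counts, population sizes); the point is the structure of the calculation, not the specific values.

```python
# All numbers below are hypothetical, chosen only to illustrate the
# base-rate effect -- they are NOT real platform statistics.
p_fp = 1e-6                # per-photo false-positive probability for an innocent photo
photos_per_parent = 5000   # innocent photos uploaded by a typical parent
n_parents = 100_000_000    # innocent users
n_offenders = 10_000       # actual offenders
p_detect = 0.9             # chance an actual offender is (correctly) flagged

# Treating each photo as an independent trial, the probability that a
# given innocent parent gets at least one false flag:
p_parent_flagged = 1 - (1 - p_fp) ** photos_per_parent

expected_false_flags = n_parents * p_parent_flagged
expected_true_flags = n_offenders * p_detect

# Fraction of flagged *people* who are innocent:
frac_innocent = expected_false_flags / (expected_false_flags + expected_true_flags)

print(f"P(innocent parent flagged at least once) = {p_parent_flagged:.3%}")
print(f"Expected falsely flagged people          = {expected_false_flags:,.0f}")
print(f"Fraction of flagged people who are innocent = {frac_innocent:.1%}")
```

Even with a one-in-a-million per-photo error rate, the sheer number of innocent photos means most flagged *people* in this toy model are innocent, which is exactly the person-vs-photo distinction above.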
Also, according to the article, Google trained the AI on nude bathtub photos so that they wouldn't (or at least shouldn't) get flagged.