
Comment by Enginerrrd

4 years ago

Wow, I'd guess the opposite when we consider the base rate. Seems like a classic Bayesian problem.

It's fairly normal for parents to take pictures of their children naked in the bathtub/at the beach/camping/etc.

Conversely, I'd expect actual pedophiles and CSAM producers to be really quite rare.

So even a relatively low base-rate of normal parents with normal nude photos would likely dwarf CSAM upon detection.

So, if we say 1/100 are pedophiles and 30/100 are parents, and 10% of the parents have such photos, then the ratio I'd expect, without getting into detection rates, is about 3:1 in favor of normal parents.
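The base-rate arithmetic above can be sketched in a few lines (all numbers are the commenter's hypothetical assumptions, not measured rates):

```python
# All rates below are the hypothetical assumptions from the comment above,
# not real measurements.
p_pedophile = 1 / 100          # assumed fraction of users who are CSAM producers
p_parent = 30 / 100            # assumed fraction of users who are parents
p_parent_has_photos = 0.10     # assumed fraction of parents with innocent nude photos

innocent = p_parent * p_parent_has_photos   # fraction of users with innocent photos
ratio = innocent / p_pedophile              # innocent : CSAM ratio, before detection rates

print(f"innocent-to-CSAM ratio before detection rates: {ratio:.0f}:1")
```

With these made-up inputs the ratio comes out to 3:1, matching the figure above; different assumed rates would of course shift it.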

> It's fairly normal for parents to take pictures of their children naked in the bathtub/at the beach/ camping/etc.

And what I understand is that each of these private photos of their children that parents take on an Android phone with Google cloud enabled gets specifically flagged to be shown to a stranger working for Google.

That sounds pretty insane to me.

  • What's more, a "professional" CP distributor would at the least encrypt or compress their photos, and probably not even use Gmail, instead maybe using a hijacked or cracked account.* So that would lower the "true" positive rate even further.

    *In fact that seems like it could be an effective form of extortion or swatting. Someone with access to your credentials could (threaten to) send CP over your account and ruin your life. Or destroy a small business.

  • This is quite the insane realization, and it needs to be shouted from the rooftops every time the encryption "debate" comes up. Authoritarians browbeat us by overstating the diffuse possibility of harm, while whitewashing the effects of their own actions. Meanwhile their actions create centralized, institutionalized, inescapable harm - specifically here, creating a ripe target for pedophiles to remotely creep on everyday families just trying to go about their lives.

If the types of photos you mention were getting flagged as per the case in the article, we would be hearing about a lot more cases. The case itself involved a medical photo, not smiling children at the beach. As such the denominator you use isn’t relevant here.

There is a lot of child pornography on image-sharing platforms. Facebook, which reports most reliably, had 20 million CSAM reports in 2021, and that's photos posted to a social network or sent via Messenger, not even in private albums. As I understand it, CSAM detection is more focused on reporting re-uploads or transmission of known abuse material than on new material. So I still maintain that if we take the type of image referenced in the article as the denominator, vastly more of such photos would be child abuse material. Granted, 1:100000 is too high; I would revise that down to 1:1000.

  • Some fair points, but I have a couple of counter points...

    One, we really need to know the base rates and false-positive rates to reason well here.

    Two, I think you'd also need to consider the number of people and the number of photos. That is, either a single photo or the flagging of a person can be considered a trial. I would imagine a given pedophile has a large number of CSAM photos, but there are few pedophiles. There are, however, lots of normal parents with fewer photos each that might trigger a false positive. Because of that base-rate issue, even if the probability that a given photo is falsely identified as CSAM is quite low, the probability that a given person is falsely flagged/reported may NOT be low.
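    The per-photo vs. per-person distinction above can be sketched as follows (the false-positive rate and library size are illustrative assumptions, not real detection figures):

    ```python
    # Illustrative assumptions only, not real detection rates.
    p_false_positive = 1e-5   # assumed chance a single innocent photo is misflagged
    n_photos = 5000           # assumed size of a typical parent's photo library

    # Probability that at least one photo in the library is misflagged,
    # treating each photo as an independent trial:
    p_person_flagged = 1 - (1 - p_false_positive) ** n_photos

    print(f"P(person falsely flagged at least once) = {p_person_flagged:.3f}")
    ```

    Even with a one-in-100,000 per-photo error rate, a 5,000-photo library gives roughly a 5% chance of the person being flagged at least once, which is the point being made: low per-photo error rates do not imply low per-person error rates.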

  • Also, according to the article, Google trained the AI on nude bathtub photos so that they wouldn't (or at least shouldn't) get flagged.