
Comment by defrost

5 hours ago

There's a legally challengeable assertion there: "trained on CSAM images".

I imagine an AI image generation model could readily be trained on images of adult soldiers at war and images of children from Instagram, and then be used to generate imagery of children at war.

I have zero interest in defending the exploitation of children, but the assertion that children had to have been exploited in order to create images of children engaged in adult activities seems shaky. *

* FWIW I'm sure there are AI models out there that were trained on actual real-world CSAM ... it's the implied necessity that's being questioned here.

It is known that the LAION dataset underpinning foundation models like Stable Diffusion contained at least a few thousand instances of real-life CSAM at one point. I think you would be hard-pressed to prove that any model trained on internet scrapes definitively wasn't trained on any CSAM whatsoever.

https://www.theverge.com/2023/12/20/24009418/generative-ai-i...

  • > I think you would be hard-pressed to prove that any model trained on internet scrapes definitively wasn't trained on any CSAM whatsoever.

    I'd be hard-pressed to prove that you definitely hadn't killed anybody ever.

    Legally, if it's asserted that these images are criminal because they are the product of a model trained on sources that contained CSAM, then the requirement would be to prove that assertion.

    With text and speech you could prompt the model to exactly reproduce a Sarah Silverman monologue and assert that this proves her content was used in the training set, etc.

    Here the defense would ask the prosecution to demonstrate how to extract a copy of original CSAM.

    But your point is well taken, it's likely most image generation programs of this nature have been fed at least one image that was borderline jailbait and likely at least one that was well below the line.

    • > Legally, if it's asserted that these images are criminal because they are the product of a model trained on sources that contained CSAM, then the requirement would be to prove that assertion.

      Legally, possession of CSAM is against the law because there is an assumption that possession proves contribution to market demand, with an understanding that demand incentivizes production of supply, meaning that with demand, children will be harmed again to produce more content to satisfy it. In other words, the intent is to stop future harm. This is why people have been prosecuted for things like suggestive cartoons that have no real-life events behind them. It is not illegal on the grounds of past events; the actual abuse is illegal on its own standing.

      The provenance of the imagery is irrelevant. What you need to prove is that your desire to have such imagery won't stimulate you or others to create new content with real people. If you could somehow prove that AI-generated content will satisfy all future thirst, problem solved! That would be world-changing.


    • Framing it in that way is essentially a get-out-of-jail-free card: anyone caught with CSAM can claim it was AI-generated by a "clean" model, and how would the prosecution ever prove that it wasn't?

      I get where you're coming from, but it doesn't seem actionable in any way that doesn't effectively legalize CSAM possession, so I think courts will have no choice but to put the burden of proof on the accused. If you play with fire then you'd better have the receipts.


  • I think you'd be hard-pressed to prove that a few thousand images (out of over 5 billion in the case of that particular data set) had any meaningful effect on the final model's capabilities.

> There's a legally challengeable assertion there: "trained on CSAM images".

"Legally challengable" only in a pretty tenuous sense that's unlikely to ever haven any actual impact.

That'll be something that's recited as a legislative finding. It's not an element of the offense; nobody has to prove that "on this specific occasion the model was trained in this or that way".

It could theoretically have some impact on a challenge to the constitutionality of the law... but only under pretty unlikely circumstances. First you'd have to get past the presumption that the legislature can make any law it likes regardless of whether it's right about the facts (which, in the US, probably means you have to get courts to review the law under strict scrutiny, which they hate to do). Then you have to prove that that factual claim was actually a critical reason for passing the law, and not just a random aside. Then you have to prove that it's actually false, overcoming a presumption that the legislature properly studied the issue. Then maybe it matters.

I may have the exact structure of that a bit wrong, but that's the flavor of how these things play out.

  • My comment was in response to a portion of the comment above:

    > because the machine-learning models utilized by AI have been trained on datasets containing thousands of depictions of known CSAM victims

    I'd argue that CSAM imagery falls into two broad categories: actual photographic images of real abuse, and generated images (paintings, drawings, animations, etc.), with all generated images being more or less equally bad.

    There's a peer link in this larger thread ( https://en.wikipedia.org/wiki/Legal_status_of_fictional_porn... ) that indicates at least two US citizens have been charged and sentenced to 20 and 40 years' imprisonment respectively for the possession and distribution of "fictional" child abuse material (animated and still Japanese cartoons, anime, etc.).

    So, in the wider world, it's a moot point whether these specific images came from training on actual abuse images or not; they depict abuse, and that's legally sufficient in the US (apparently). Further, the same depictions could be generated with or without actual abuse images, and as equivalent images they'd be equally offensive either way.

Exactly. The abundance of AI-generated renditions of Shrimp Jesus doesn't mean it was trained on actual photos of an actual Shrimp Jesus.