In parallel with our last report, we've been working on a collaboration with Thorn on the spike in computer-generated CSAM, largely driven by Stable Diffusion and the ecosystem that has emerged around it. We go over how we got here, what new tech is enabling it, and where things go from here.

cyber.fsi.stanford.edu/io/publ

Essentially, since Stable Diffusion 1.5 dropped, a community has formed around generating adult content with it, with extensive effort put into training models and model augmentations that make producing realistic explicit content straightforward. The underlying tech has advanced *really* rapidly in 2023, greatly outperforming the base SD model. Producing fully photorealistic content still takes some work, but it's getting easier fast.

You may notice that some of these adult content models recommend putting "child" in your negative prompt, which should tell you exactly what the problem is: alongside new models and augmentations for generating increasingly specific and more realistic explicit content, there's a sub-community of people using them to create CSAM.

Some disturbing "SFW" samples are present on Twitter and Instagram, and worse is well known to be prevalent on the "free speech absolutist" corners of the Fediverse. Thorn has worked with law enforcement to confirm that a small but growing percentage of this content is basically indistinguishable from real imagery to the casual observer, and some models have been trained to generate content featuring actual victims.

Because SD is now fairly convenient to run locally, there are effectively no guardrails in place. Generating, fine-tuning and retraining will only get faster. Even basic measures like the SD watermark have been stripped out of the most widely used implementation, so we're approaching a scenario where NCMEC et al are dealing with CG-CSAM that they can't distinguish from real imagery and can't tell whether there's a victim to find.

There are some ways, albeit limited, to mitigate this. Newer model checkpoints can be trained to not produce depictions of children at all, which would be a sensible standard practice for anything capable of producing explicit content. Platforms distributing models can require this. More research into keyed visual content watermarking and generated-content detection would be a great help, and it's something that should have accompanied the release of generative models rather than being bolted on afterward.


@det not sure if this is helpful beyond my when-you-hold-a-hammer viewpoint, but I'm dropping it here; apologies if it is not.

Over the years I've had an important role in an open brainstorming toolkit for crime prevention. Basically it's a guided website where problems are described and users are guided through brainstorming solutions for them.

It is based on a rather sophisticated model of crime prevention called the conjunction of criminal opportunity, which features 11 contributing factors (the conventional crime prevention triangle features 3), ranging from crime promoters, through target enclosure and the offender's assessment of risks, to criminal predisposition.

The toolkit is freely available at cco.works/ and is usable at no cost whatsoever.

If you (or anyone) thinks it might be useful to promote a problem-solving discussion, we can easily include a scenario exemplifying the problem to serve as a starter.

Let me know if it's of any interest at all. We've been working on this for more than a decade now.
