**Mike Kasprzak 🦖** @mike@jammer.social · Mar 27, 2023, 14:41

Experimenting with #OpenAI's moderation model. It does a really good job of extracting implication from a string of text, picking up hate, violence, and sexual cues.

Unfortunately this doesn't work for spam detection.

Something to explore later: whether these hate/violence/sex cues can tell us how well understood or misunderstood something will be. For example, does authority come off as violent, or teasing come off as sexual? 🤔
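A minimal sketch of reading those cues back out, assuming the response carries a per-category score map like the moderation endpoint's `category_scores`. The sample scores are invented for illustration, and `top_cues` and its `threshold` are hypothetical names, not part of any API.

```python
# Hypothetical result, shaped like one entry of a moderation response's
# "results" list. The scores are made up for illustration only.
sample_result = {
    "flagged": False,
    "category_scores": {
        "hate": 0.0021,
        "violence": 0.0004,
        "sexual": 0.0130,
    },
}


def top_cues(result, threshold=0.001):
    """Return (category, score) pairs at or above threshold, highest first."""
    scores = result["category_scores"]
    hits = [(name, score) for name, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


print(top_cues(sample_result))
```

Even when nothing is flagged, the small non-zero scores are what would feed the "does authority come off as violent" kind of comparison.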

**Mike Kasprzak 🦖** @mike@jammer.social · Mar 27, 2023, 15:01

A brief test of that: apparently being more assertive is LESS hateful and violent, but I suspect that at this small a sample size it's just rounding error.

Whereas using the word "ass" scores us some sex points, and referring to said ass as "fat" scores us some hate points. It's still a super tiny amount, but if we reword our phrase more politely we don't register on any of the metrics.

Neat. #OpenAI

**Mike Kasprzak 🦖** @mike@jammer.social · Mar 27, 2023, 16:21

I roughly understand how to use #OpenAI's embeddings API, enough to not use it (yet).

The TL;DR is you feed it a text blob, and you get back a vector with ~1500 components (🤯). By itself the vector is meaningless, but patterns will emerge between similar data.

Example: spam posts should find themselves weighted towards one or more axes (angles?), but you need lots of non-spam data to find them. You could also do general search with it, but IMO it would be overkill (expensive).

Pretty cool tho.
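One way to picture "patterns emerging between similar data" is cosine similarity, which compares the angle between two vectors. This is a sketch with invented 4-component toy vectors standing in for real ~1500-component embeddings; the vector names and values are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy stand-ins for embeddings; real ones would come from the embeddings API.
spam_a = [0.9, 0.1, 0.0, 0.2]
spam_b = [0.8, 0.2, 0.1, 0.1]
legit = [0.1, 0.9, 0.3, 0.0]

print("spam vs spam: ", round(cosine_similarity(spam_a, spam_b), 3))
print("spam vs legit:", round(cosine_similarity(spam_a, legit), 3))
```

In this toy data the two spam-like vectors point in nearly the same direction, while the legit one does not. Whether real spam clusters this cleanly is exactly what needs the non-spam data to check.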

**Mike Kasprzak 🦖** @mike@jammer.social · Mar 27, 2023, 16:29

With that in mind, I'm going to use Akismet in the near term for detecting and flagging suspicious content.

It's a bit aggressive (it simply says yes or no), but that should give some reasonable data for testing new user content to see whether it's legit. #LDJam

**l'empathie mécanique** @dpwiz@qoto.org · Mar 27, 2023, 19:40

@mike Can you run all the comments through both Akismet and embeddings? An SVM should then give a good-enough result for cheap.
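A rough sketch of that idea, under loud assumptions: the 3-component vectors are invented stand-ins for embeddings, the +1/-1 labels are hypothetical Akismet verdicts, and the classifier is a hand-rolled linear SVM trained by sub-gradient descent on the hinge loss rather than a library implementation.

```python
def score(w, b, x):
    """Signed distance-like score of x under the linear model (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Sub-gradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * score(w, b, x)
            # Regularization step: shrink the weights slightly every update.
            w = [wi * (1 - lr * reg) for wi in w]
            if margin < 1:
                # Sample is inside the margin or misclassified: push toward it.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (spam) or -1 (legit)."""
    return 1 if score(w, b, x) >= 0 else -1


# Toy "embeddings" with hypothetical Akismet labels: +1 spam, -1 legit.
samples = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.7, 0.0, 0.2],
    [0.1, 0.9, 0.3], [0.2, 0.8, 0.1], [0.0, 0.7, 0.4],
]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
print(predict(w, b, [0.85, 0.15, 0.05]))  # spam-like toy vector
print(predict(w, b, [0.10, 0.80, 0.20]))  # legit-like toy vector
```

The "cheap" part is inference: once trained, classifying a new comment is one dot product against its embedding. The embedding calls themselves are where the cost sits.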

**Mike Kasprzak 🦖** @mike@jammer.social · Mar 27, 2023, 21:31

@dpwiz Yes I can, and that will probably be where I end up. Akismet doesn't seem to have many false positives, but broadly it doesn't seem to catch much.

The current problem with embeddings is that I have well over 1 million pieces of content that could be scanned. That bill will add up fast, and I'd be wise to keep a copy of the output for future use, which is potentially another 12 GB of data to store and archive... at least until GPT-3-ADA is deprecated and compatibility breaks. 😅
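A back-of-the-envelope check of that storage figure, assuming 1536 components per vector (the thread only says "~1500") and exactly 1 million items as a lower bound. Under those assumptions, 12 GB lines up with storing each component as an 8-byte float; 4-byte floats would halve it.

```python
ITEMS = 1_000_000   # lower bound; the thread says "well over 1 million"
DIMENSIONS = 1536   # assumed vector length; the thread says "~1500"
BYTES_PER_VALUE = {"float32": 4, "float64": 8}


def storage_gb(items, dims, bytes_per_value):
    """Raw vector storage in decimal gigabytes, ignoring any file overhead."""
    return items * dims * bytes_per_value / 1e9


for name, size in BYTES_PER_VALUE.items():
    print(f"{name}: {storage_gb(ITEMS, DIMENSIONS, size):.1f} GB")
```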
