In the old days it was hard to program in everything to watch for in user input, including huge lists of bad words/phrases. In my GPT-3 based app I just coded in this command: If the request is lewd, tell the user that the question is not a suitable topic for this tool.
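A minimal sketch of what that prompt-level guardrail looks like in code: the standing moderation instruction is simply prepended to whatever the user types, and the model itself decides when to refuse. The names here (`GUARDRAIL`, `build_prompt`) are illustrative, not the actual app's code.

```python
# Illustrative only: a prompt-level guardrail, where the moderation rule
# is prepended to the raw user input before it is sent to the model.

GUARDRAIL = (
    "If the request is lewd, tell the user that the question is not "
    "a suitable topic for this tool.\n\n"
)

def build_prompt(user_input: str) -> str:
    """Wrap the raw user input with the standing moderation instruction."""
    return GUARDRAIL + "User request: " + user_input

print(build_prompt("boxers or briefs?"))
```

The resulting string would then be passed to the completion API; the guardrail travels with every request rather than living in separate filtering code.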
@BobGourley Sure, but how well does that work? Well enough for your requirements? How do you know? How do you measure it?
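One concrete way to answer "how do you measure it?" is a small labeled test set: run prompts that should and shouldn't trip the guardrail, and count how often the refusal actually fires. This is a hedged sketch with made-up replies and a hypothetical `looks_refused` check on the reply text; in practice you would collect real model outputs.

```python
# Sketch of measuring a prompt-level filter with a labeled test set.
# The replies below are invented examples, not real model output.

TEST_CASES = [
    # (model_reply, should_have_refused)
    ("That question is not a suitable topic for this tool.", True),
    ("Here is a summary of the topic you asked about.", False),
    ("Sorry, not a suitable topic for this tool.", True),
]

def looks_refused(reply: str) -> bool:
    """Crude check for whether the guardrail's refusal message fired."""
    return "not a suitable topic" in reply.lower()

def refusal_accuracy(cases):
    """Fraction of cases where the filter fired exactly when it should."""
    correct = sum(looks_refused(reply) == expected for reply, expected in cases)
    return correct / len(cases)

print(f"accuracy: {refusal_accuracy(TEST_CASES):.0%}")
```

Even a crude harness like this turns "seems to work" into a number you can track as the prompt changes.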
@BobGourley Ah, interesting! I asked it one serious question, which it answered plausibly if very generically, and one silly one ("boxers or briefs?"), which it fielded very nicely. :)
I'm not particularly good myself at getting generative text AIs to venture beyond their intended bounds; I'm just always curious how well they work in practice, when we basically know nothing in detail about what's happening inside. It's just a black box that seems to work for lots of specific test cases, we know not how. I'm very curious as to how that will play out in practice, in places where it matters.
@ceoln Thanks for kicking the tires on the site! Much appreciated.