In the old days it was hard to program in everything to watch for in user input, including huge lists of bad words/phrases. In my GPT-3 based app I just coded in this command: If the request is lewd, tell the user that the question is not a suitable topic for this tool.
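A minimal sketch of what that prompt-level guardrail looks like in code: the standing moderation instruction is simply prepended to whatever the user types, and the model itself decides when to refuse. The names here (`GUARDRAIL`, `build_prompt`) are illustrative, not the actual app's code.

```python
# Illustrative only: a prompt-level guardrail, where the moderation rule
# is prepended to the raw user input before it is sent to the model.

GUARDRAIL = (
    "If the request is lewd, tell the user that the question is not "
    "a suitable topic for this tool.\n\n"
)

def build_prompt(user_input: str) -> str:
    """Wrap the raw user input with the standing moderation instruction."""
    return GUARDRAIL + "User request: " + user_input

print(build_prompt("boxers or briefs?"))
```

The resulting string would then be passed to the completion API; the guardrail travels with every request rather than living in separate filtering code.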
@BobGourley Sure, but how well does that work? Well enough for your requirements? How do you know? How do you measure it?
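One concrete way to answer "how do you measure it?" is a small labeled test set: run prompts that should and shouldn't trip the guardrail, and count how often the refusal actually fires. This is a hedged sketch with made-up replies and a hypothetical `looks_refused` check on the reply text; in practice you would collect real model outputs.

```python
# Sketch of measuring a prompt-level filter with a labeled test set.
# The replies below are invented examples, not real model output.

TEST_CASES = [
    # (model_reply, should_have_refused)
    ("That question is not a suitable topic for this tool.", True),
    ("Here is a summary of the topic you asked about.", False),
    ("Sorry, not a suitable topic for this tool.", True),
]

def looks_refused(reply: str) -> bool:
    """Crude check for whether the guardrail's refusal message fired."""
    return "not a suitable topic" in reply.lower()

def refusal_accuracy(cases):
    """Fraction of cases where the filter fired exactly when it should."""
    correct = sum(looks_refused(reply) == expected for reply, expected in cases)
    return correct / len(cases)

print(f"accuracy: {refusal_accuracy(TEST_CASES):.0%}")
```

Even a crude harness like this turns "seems to work" into a number you can track as the prompt changes.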
@BobGourley Ah, interesting! I asked it one serious question, which it answered plausibly if very generically, and one silly one ("boxers or briefs?"), which it fielded very nicely. :)
I'm not particularly good myself at getting generative text AIs to venture beyond their intended bounds; I'm just always curious how well they work in practice, when we basically know nothing in detail about what's happening inside. It's just a black box that seems to work for lots of specific test cases, we know not how. I'm very curious as to how that will play out in practice, in places where it matters.
@ceoln Thanks for kicking the tires on the site! Much appreciated.