I'm worrying about #feedbackLoops and #generativeAI. #ChatGPT responses are all over the public web. People are already using it to help write journal articles. https://www.theverge.com/2023/1/5/23540291/chatgpt-ai-writing-tool-banned-writing-academic-icml-paper These will be sources of new training data for #LLMs.
What happens when LLMs train on the output of LLMs? Well, any tendencies and biases of LLMs will intensify. But we're not even in a position to anticipate the details. So, what to do?
#AIEthics 🧵 1/4
Good question.
You might know that ChatGPT itself is working on a digital watermarking project, based on pseudorandom choices on its output distributions – it's touted as an anti-propaganda, or anti-plagiarism tool, which actually doesn't make sense because of the question who has access to the key. However what you describe makes perfect sense: filtering of crawled corpora is actually a really good use case.
I think there is a much bigger, as yet untapped (or at least not yet publicized) data source that won't have that problem for a while: Google Books.
@boris_steipe Thanks for pointing me to the #ChatGPT watermarking plan!
That addresses one part of the challenge of keeping future training data unpolluted by AI-generated text: It allows exclusion by way of diction and punctuation.
But the deeper worry, I think, has to do with the content/meaning of the AI-generated text. If someone rephrased the ChatGPT output before publishing it, then that content would still be out there for future training, yielding #feedbackLoops.
You're welcome. You are right that modified text would evade the watermark, but the filtering doesn't have to rely only on the statistical signature of the generation process: for longer texts, you could filter on the perplexity of the text itself. Or, put differently: accept into the training data only text that actually has something new to say.
What a radical idea: we might even apply such a filter to human discourse. Wouldn't that be nice 🙂
"But if it can be classified, then it can be generated" ... Ah, yes, but that's not to say it is useful. Novelty is necessary but not sufficient. The major breakthrough will come when the algorithms learn to evaluate the quality of their own proposals in a generalized context. The keywords in this domain are "ranking" and "evaluation".