So OpenAI just released a detector of AI-generated text, I assume because of concerns in education / homework.
https://openai.com/blog/new-ai-classifier-for-indicating-ai-written-text/
Maybe this is good?
No, it's very bad.
They claim 26% true positives and 9% false positives. Assume 10% of submitted homework is ChatGPT-generated and you get the classic counterintuitive outcome of poor predictive power: if a homework is flagged, the odds are about 3:1 that it's *human*-generated.
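For anyone who wants to check the arithmetic, here is a minimal sketch of the base-rate calculation. The 26% and 9% figures are the ones quoted above; the 10% prevalence is only the assumption made in this post, and the function name `flagged_breakdown` is made up for illustration.

```python
def flagged_breakdown(prevalence, true_positive_rate, false_positive_rate):
    """Return (P(human | flagged), odds of human vs. AI among flagged items)."""
    # Share of all homework that is AI-written AND gets flagged.
    flagged_ai = prevalence * true_positive_rate
    # Share of all homework that is human-written AND gets flagged.
    flagged_human = (1 - prevalence) * false_positive_rate
    p_human = flagged_human / (flagged_ai + flagged_human)
    odds_human_to_ai = flagged_human / flagged_ai
    return p_human, odds_human_to_ai


# Assumed inputs: 10% prevalence (a guess), 26% TPR and 9% FPR (as quoted).
p_human, odds = flagged_breakdown(0.10, 0.26, 0.09)
print(f"P(human | flagged) = {p_human:.3f}")   # 0.757
print(f"odds human:AI      = {odds:.2f}:1")    # 3.12:1
```

Out of every 1000 submissions that is 26 flagged AI-written ones against 81 flagged human-written ones. A lower assumed prevalence makes the ratio worse, a higher one makes it better.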
This is going to cause a lot of harm. It should be immediately recalled.
How does plagiarism detection software work normally?
@Cmastication @zleap @ben That's not how MOSS, the most widely-used checker, works.
Sounds interesting. From this I would guess it looks at a string such as "The cat sat on the mat." or "The quick brown fox jumps over the lazy dog." and generates a hash from it. Is this similar to how, say, md5sum works? I could write one of the above in a text file, save it, and generate an md5sum from that, and this would be unique. If you went back in and changed a lower-case letter to upper case, or added a comma, it would change the file, and the md5sum would be different. We could then compare the two checksums to see whether they match.
Or am I completely off track here? I am not remotely an expert in this.
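The md5sum behaviour described above is easy to try out. This is a minimal sketch using Python's standard `hashlib`, not a description of how MOSS or any other checker works; the helper name `md5_hex` and the edited sentences are made up for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 digest of the UTF-8 bytes of `text`, as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


original = "The cat sat on the mat."
lowercased = "the cat sat on the mat."      # one letter changed in case
extra_comma = "The cat sat, on the mat."    # one comma added

# Note: md5sum on a saved text file would also hash the trailing newline,
# so its digest would differ from the ones printed here.
for label, text in [("original", original),
                    ("lowercased", lowercased),
                    ("extra comma", extra_comma)]:
    print(f"{label:12} {md5_hex(text)}")

# Identical text always gives the same checksum; any edit gives a different one.
print(md5_hex(original) == md5_hex(original))     # True
print(md5_hex(original) == md5_hex(lowercased))   # False
```

So comparing checksums of whole texts only answers "are these byte-for-byte identical?". It says nothing about how similar two texts are, since a one-character edit gives a completely different digest.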