So OpenAI just released a detector of AI-generated text, I assume because of concerns in education / homework.

openai.com/blog/new-ai-classif

Maybe this is good?

No, it's very bad.

They claim 26% true positives, 9% false positives. Assume 10% of submitted homework is ChatGPT-generated, and you get the classic counterintuitive outcome of poor predictive power: if a homework is flagged, the odds are about 3:1 that it's *human*-generated.
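For anyone who wants to check the arithmetic, here is a minimal sketch of the Bayes' rule calculation. The 26% and 9% rates are the ones quoted above; the 10% base rate is the assumption stated above, and the function name is just made up for illustration.

```python
def posterior_ai_given_flag(true_positive_rate, false_positive_rate, prior_ai):
    """Probability that a flagged text is AI-generated, by Bayes' rule."""
    flagged_ai = true_positive_rate * prior_ai             # AI-written and flagged
    flagged_human = false_positive_rate * (1 - prior_ai)   # human-written and flagged
    return flagged_ai / (flagged_ai + flagged_human)


# Rates quoted in the post; the 10% prior is an assumption, not a measurement.
p_ai = posterior_ai_given_flag(0.26, 0.09, 0.10)
odds_human = (1 - p_ai) / p_ai

print(f"P(AI | flagged) = {p_ai:.2f}")                    # roughly 0.24
print(f"odds human : AI = {odds_human:.1f} : 1")          # roughly 3.1 : 1
```

Changing the assumed base rate moves the answer a lot, so it is worth trying a few values of `prior_ai` before trusting any single number.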

This is going to cause a lot of harm. It should be immediately recalled.

@ben

How does plagiarism detection software work normally?

@zleap @ben normally it matches literal strings. e.g. “these two sentences came from this source”

@shriramk @zleap @ben looks like MOSS uses fingerprinting, which is a computationally efficient way to find whitespace-invariant string matches.

yangdanny97.github.io/blog/201
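To make "fingerprinting" a bit more concrete, here is a rough sketch in the spirit of winnowing, the technique usually associated with MOSS. It is not MOSS's actual code: the k-gram length, the window size, the use of CRC32 as the hash, and the `fingerprints` / `overlap` names are all assumptions picked for illustration.

```python
import zlib


def fingerprints(text, k=5, window=4):
    """Select a subset of k-gram hashes as the document's fingerprint."""
    # Drop all whitespace so that matches are whitespace-invariant.
    cleaned = "".join(text.split())
    # Hash every run of k consecutive characters (CRC32 is a stand-in hash).
    hashes = [
        zlib.crc32(cleaned[i:i + k].encode("utf-8"))
        for i in range(len(cleaned) - k + 1)
    ]
    if not hashes:
        return set()
    # Slide a window over the hashes and keep the minimum of each window.
    selected = set()
    for i in range(max(1, len(hashes) - window + 1)):
        selected.add(min(hashes[i:i + window]))
    return selected


def overlap(a, b):
    """Fraction of the smaller fingerprint set that also appears in the other."""
    fa, fb = fingerprints(a), fingerprints(b)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / min(len(fa), len(fb))


sentence = "The cat sat on the mat."
essay = "Intro.  The cat   sat on\nthe mat. Outro."
print(overlap(sentence, essay))
```

Because only the minimum hash in each window is kept, two documents are compared through a small set of numbers rather than through every substring, and a copied passage that is long enough still leaves shared fingerprints even when the surrounding text differs.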

@Cmastication @shriramk @ben

Sounds interesting, so from this I would guess it would look at a string such as

"The cat sat on the mat." or "The quick brown fox jumps over the lazy dog." and generate a hash from it. Is this similar to how, say, md5sum works? I could write one of the above in a text file, save it, and generate an md5sum from that, which would be unique. If you went back in and changed a lower case letter to upper case, or added a comma, it would change the file, and the md5sum would be different. We could then compare the two checksums to see whether they match.

Or am I completely off track here? I am not remotely an expert in this.
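The md5sum idea is easy to try out. The sketch below uses Python's standard `hashlib` (the `md5_hex` helper is just a name made up here) to show that changing a single letter's case gives a different checksum, so a whole-file checksum can only say whether two texts are exactly identical, not whether they share a copied sentence.

```python
import hashlib


def md5_hex(text):
    """MD5 digest of the text, as a hex string (like md5sum on a file)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


original = "The cat sat on the mat."
edited = "the cat sat on the mat."   # only the first letter's case differs

print(md5_hex(original))
print(md5_hex(edited))
print(md5_hex(original) == md5_hex(edited))   # False: the checksums differ
```

That is presumably why fingerprinting hashes many short overlapping pieces of the text instead of the whole file: a small edit changes only the few pieces that touch it, and the rest can still match.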
