I just thought of a use for LLMs/ChatGPT:
q: What is "ΓÇ»PM" a mojibake of?

GPT-4o: "ΓÇ»PM" is likely a mojibake (garbled text resulting from text encoding issues) of "—PM". This typically happens when text encoded in UTF-8 is misinterpreted as Windows-1252 or a similar encoding. In UTF-8, the em dash (—) is represented by the byte sequence `E2 80 94`. If these bytes are incorrectly interpreted as Windows-1252, they are displayed as "ΓÇ»". Therefore, "ΓÇ»PM" should correctly be "—PM".


@eed3si9n cool idea! I would be worried about tokenization weirdness with this kind of thing, but don't know if it matters in practice

@spoltier it follows the logic halfway and then jumps to a weird conclusion? I couldn't get it to check its work by going the other way
mastodon.social/@eed3si9n/1124

@eed3si9n yeah, unless it has this exact case in the training set, I don't think you could get the right answer without somehow getting it to use the code interpreter
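
For what it's worth, here is a minimal sketch of what "checking the work by going the other way" could look like in code. It assumes the garbled text is UTF-8 that got displayed in some legacy single-byte encoding; the candidate list (`cp1252`, `latin-1`, `cp437`) and the helper names `undo_mojibake` and `make_mojibake` are illustrative guesses, not something from the thread.

```python
import unicodedata


def undo_mojibake(garbled, wrong_encoding, right_encoding="utf-8"):
    """Recover the bytes by re-encoding with the wrongly assumed encoding,
    then decode them with the encoding they were really written in."""
    return garbled.encode(wrong_encoding).decode(right_encoding)


def make_mojibake(text, wrong_encoding, right_encoding="utf-8"):
    """Go the other way: encode correctly, then decode with the wrong encoding."""
    return text.encode(right_encoding).decode(wrong_encoding)


garbled = "ΓÇ»PM"
candidates_found = []

# Hypothetical candidate encodings; extend the tuple as needed.
for candidate in ("cp1252", "latin-1", "cp437"):
    try:
        fixed = undo_mojibake(garbled, candidate)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The garbled characters do not exist in this encoding,
        # or the recovered bytes are not valid UTF-8.
        continue
    # Only accept a candidate if the round trip reproduces the garbled text.
    if make_mojibake(fixed, candidate) == garbled:
        candidates_found.append((candidate, fixed))

for candidate, fixed in candidates_found:
    names = [unicodedata.name(ch, "UNKNOWN") for ch in fixed]
    print(candidate, repr(fixed), names)
```

Under these assumptions only `cp437` survives the round trip, and the first recovered character is U+202F NARROW NO-BREAK SPACE (bytes `E2 80 AF`). The forward check `make_mojibake("—PM", "cp1252")` does not produce "ΓÇ»PM", which is the kind of mismatch a round-trip test would catch.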
