Is there a page where you can paste a string with unknown-to-you Unicode characters and it explains them all?

Stuff like invisible typographical spaces, zero width stuff, emoji joiners, RTL, emoji variant selectors, math symbols used as “fonts,” control characters, etc.? Not just listing stuff, but also explaining them in friendly terms, and maybe adding a little history?

(Because I want to build one if that doesn’t exist. I can see it being useful for debugging, but also education.)
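As a rough starting point, here is a minimal sketch of the "paste a string, get each character explained" idea, using Python's standard `unicodedata` module. The `explain` function and the `CATEGORY_HINTS` table are hypothetical names made up for illustration, and the friendly descriptions are placeholders rather than a complete mapping.

```python
import unicodedata

# Hypothetical, deliberately incomplete table of friendly hints,
# keyed by Unicode general category.
CATEGORY_HINTS = {
    "Cf": "invisible formatting character",
    "Cc": "control character",
    "Zs": "space separator",
    "Mn": "combining mark",
    "Sm": "math symbol",
}

def explain(text):
    """Return one (code point, name, category, hint) tuple per character."""
    rows = []
    for ch in text:
        category = unicodedata.category(ch)
        # Control characters have no name in the database, so supply a default.
        name = unicodedata.name(ch, "<unnamed>")
        hint = CATEGORY_HINTS.get(category, "")
        rows.append((f"U+{ord(ch):04X}", name, category, hint))
    return rows

# An "a", a zero width joiner, and a "b": visually just "ab".
for row in explain("a\u200db"):
    print(row)
```

This only covers the listing part. The friendly explanations and history would still have to be written by hand.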

@typographische Oh, it seems broken now but that feels like a promising name!

@typographische Oh, it works now. Yeah. I think I want something string oriented and not as… nerdy.

@mwichary There used to be a short text for each glyph describing its origin and use. I have no idea whether this has since been removed or whether I just can't find it. However, these explanations are included in the book of the same name, which is over 650 pages long.


@typographische @mwichary One problem is that many characters differ in intent but are indistinguishable to the untrained eye. These often get misused, most trivially during OCR. Examples include the Latin and Cyrillic letter "e", and accents such as "è" and "é". These are "English-friendly" examples. In medieval manuscripts there's a huge variety of overhead characters that are now represented in Unicode, but we (non-experts) have little sensitivity for what their meaning is/was.
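To make the lookalike point concrete, here is a small sketch using Python's standard `unicodedata` module. The `describe` helper is a hypothetical name for illustration. It shows that the two "e" letters and the two accented letters are separate code points with separate names, even when they look alike on screen.

```python
import unicodedata

def describe(text):
    """Return the official Unicode name of each character in text."""
    return [unicodedata.name(ch, "<unnamed>") for ch in text]

latin_e = "\u0065"     # e
cyrillic_e = "\u0435"  # е, looks the same in most fonts

print(latin_e == cyrillic_e)          # the strings compare as different
print(describe(latin_e + cyrillic_e))
print(describe("\u00e8\u00e9"))       # è and é
```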
