QC

Who uses the 🫰 emoji for “money”? I only know its reference to “love” used in K-pop culture.

#emoji #unicode

Aaron “#e14n pro” Madlon-Kay

Happy #Apple ecosystem update day to all who celebrate 🎉

Newly covered #Unicode code points in iOS 18.4:

ౝ౷಄ೝഀഁഄ഻഼൏ൔൕൖ൘൙൚൛൜൝൞ൟ൶൷൸ඁ෦෧෨෩෪෫෬෭෮෯ꣾꣿ𑇡𑇢𑇣𑇤𑇥𑇦𑇧𑇨𑇩𑇪𑇫𑇬𑇭𑇮𑇯𑇰𑇱𑇲𑇳𑇴🪉🪏🪾🫆🫜🫟🫩

Mar 31, 2025, 22:50
Arne Babenhauserheide

Unicode pictograms to mark progressively rising values:
draketo.de/anderes/unicode-ico

For Ace Maths I searched for unicode icons to mark progression: show that you’re getting better. Since I found a lot of different options (from simple sparklines to a huge list of animals), I’m collecting them in this article.

Includes a tip for somewhat better, automatic #unicode "image" support in #LaTeX with plain #pdflatex / #pdftex
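A minimal sketch of the sparkline option mentioned above, assuming the eight block characters U+2581 to U+2588 as the scale. The function name `sparkline` is made up for illustration and is not taken from the linked article.

```python
# Eight block elements, from lowest (U+2581) to full height (U+2588).
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one block character, scaled to the range of the data."""
    values = list(values)
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    top = len(BARS) - 1
    return "".join(BARS[int((v - lo) * top / span)] for v in values)


# Rising scores produce a visibly rising line.
print(sparkline([1, 2, 3, 5, 8, 13]))
```

The smallest value always gets the lowest bar and the largest the full block, so the line shows relative progress rather than absolute scores.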

Mar 29, 2025, 17:39
Kevin M. Vuilleumier 🇨🇭

Do you have any experience with IDNs (Internationalized Domain Names), that is, domain names containing characters other than ASCII ones?

I'm mostly wondering about their compatibility with tools and software. Browsers are fine, they all support them, so I don't have to worry there.

For APIs and other tools, on the other hand...

#dns #idn #punycode #unicode
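One hedged way to sidestep tools that only accept ASCII is to hand them the Punycode ("xn--") form of the name. The sketch below uses Python's built-in `idna` codec, which implements the older IDNA 2003 rules (RFC 3490). The domain `bücher.example` and the helper names are illustrative assumptions.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII (Punycode) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


ascii_name = to_ascii("bücher.example")
print(ascii_name)              # the form to pass to ASCII-only tools
print(to_unicode(ascii_name))  # round-trips to the original name
```

Names that IDNA 2008 treats differently would need a third-party library instead of the built-in codec.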

Terence Eden

Which is your favourite #Unicode telephone?

butterflyofChick ⏚ꝃ⌁⁂

And this selection "is not accurate". In other words: it is 100% false.

#Unicode #Kabyle #CLDR

Mar 25, 2025, 22:40
butterflyofChick ⏚ꝃ⌁⁂

So, the correct [full wide] names of the days of the week in the Kabyle language are:

Sunday, Acer
Monday, Arim
Tuesday, Aram
Wednesday, Ahad
Thursday, Amhad
Friday, Sem
Saturday, Sed

#Kabyle #Unicode #CLDR

Mar 25, 2025, 22:34
butterflyofChick ⏚ꝃ⌁⁂

Ticket number 18455 opened on Unicode's Jira about the names of the days of the week in Kabyle.

Link: unicode-org.atlassian.net/brow

#Kabyle #Unicode #CLDR

Mar 25, 2025, 22:28
Alain MICHEL 🤓

Fake bold, fancy characters, emoji abuse: the misuse of Unicode characters, a scourge for #accessibility on the #Web

There is currently a fad for writing social media posts that give the impression the text has special formatting (bold, italic, underline, script, etc.) by means of certain #Unicode characters.

A real plague for people who use screen readers!

by @lalutineduweb

lalutineduweb.fr/detournement-
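To illustrate what the "fake bold" actually is: the letters are Mathematical Alphanumeric Symbols, which carry compatibility mappings to plain letters. A minimal sketch, assuming NFKC normalization is acceptable as a way to recover readable text. It is lossy, and it is not a description of how any screen reader behaves. The helper name `unfake` is hypothetical.

```python
import unicodedata

# "bold" written with MATHEMATICAL BOLD SMALL letters (U+1D41B, U+1D428, ...).
FAKE_BOLD = "\U0001D41B\U0001D428\U0001D425\U0001D41D"


def unfake(text: str) -> str:
    """Fold compatibility characters (such as fake bold) back to plain ones."""
    return unicodedata.normalize("NFKC", text)


print([f"U+{ord(ch):04X}" for ch in FAKE_BOLD])  # four non-ASCII code points
print(unfake(FAKE_BOLD))                          # plain ASCII letters
```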

Dr. Fortyseven 🥃 █▓▒░

It's kinda funny how one person's clever hack (#Unicode ZWJ sequences) can lead to anger and misinformation down the line.

Here, for whatever reason, ZWJ sequences were breaking over on #Threads and users immediately assumed anti-trans malice on the part of the platform.

It's easy to understand WHY someone would assume that in the current climate, but we've got to be careful to rule out other explanations before throwing confident accusations around. (This goes for me, too, btw.)

EDIT: Apparently it was fixed for them at some point. No doubt they responded with something along the lines of "they heard our anger and feared us so they restored it!" or some such. ;)
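For anyone unsure what a ZWJ sequence is, here is a small sketch. It assumes, purely for illustration, that the affected emoji was the transgender flag, which the post does not say. The sequence is two ordinary emoji glued together by U+200D ZERO WIDTH JOINER, so a renderer that drops or mishandles the joiner shows the two pieces separately.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Assumed example: white flag + transgender symbol, each followed by
# variation selector-16 (U+FE0F), joined by a ZWJ.
TRANS_FLAG = "\U0001F3F3\uFE0F" + ZWJ + "\u26A7\uFE0F"


def split_zwj(sequence: str) -> list:
    """Return the pieces a renderer falls back to when the joiner is not honored."""
    return sequence.split(ZWJ)


for piece in split_zwj(TRANS_FLAG):
    print([f"U+{ord(ch):04X}" for ch in piece])
```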

Mar 23, 2025, 19:10
argv minus one

#Unicode is one of those little things in life that I can't help but smile about.

Is it perfect? No, of course not. Is it better than the alternative? Yes, so much so that every time I'm confronted with a long list of character encodings I can choose from, I feel a sense of relief when I find #UTF8 among them.

I wouldn't have thought it possible to standardize a single character encoding for everyone, and yet, somehow, there is just such a standard.

#programming

HoldMyType

UTF-8 has become the de facto standard for representing text. Emacs closely follows the #unicode standard, but uses an extended version of UTF-8 which enables support for raw bytes. Let me explain.

One of the reasons that UTF-8 is so useful is that ASCII characters are automatically valid. These are the values between 0 and 127, and they include the English alphabet. If you assigned a code point to every value of a byte, you could only have 256 possible characters. Instead, bigger code points are encoded using multiple bytes. The values above 127 are reserved for the leading and continuation bytes of those multi-byte sequences. Thus a random value above the ASCII range may not be valid on its own. However, Emacs extends Unicode to reserve the code points 0x3FFF80 to 0x3FFFFF as “raw bytes between 128 and 255”.
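The byte ranges described above can be sketched as a small classifier. This follows standard UTF-8 as defined by Unicode, not the Emacs extension, and the function name `classify` is only illustrative.

```python
def classify(byte: int) -> str:
    """Say what role a single byte value can play in standard UTF-8."""
    if byte <= 0x7F:
        return "ascii"          # a complete one-byte character
    if byte <= 0xBF:
        return "continuation"   # only valid after a lead byte
    if byte in (0xC0, 0xC1) or byte >= 0xF5:
        return "invalid"        # never appears in well-formed UTF-8
    if byte <= 0xDF:
        return "lead-2"         # starts a two-byte sequence
    if byte <= 0xEF:
        return "lead-3"         # starts a three-byte sequence
    return "lead-4"             # starts a four-byte sequence


for text in ("A", "é", "€"):
    print(text, [classify(b) for b in text.encode("utf-8")])
```

A lone byte such as 0x81 is classified as a continuation byte, which is why it cannot stand on its own in valid UTF-8.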

The advantage of this is that Emacs can distinguish a “normal” byte that just happens to be valid UTF-8 from a “raw byte” that is not intended to be valid. However, the display representation can be confused with unprintable characters. For example, if you see this printed representation in the buffer:

\201

it can either be the Unicode code point 0x81 (Emacs displays such characters as octal escapes) or the raw byte 0x81 represented by code point 0x3FFF81. The only way to tell the difference is to inspect the character.
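The arithmetic behind that ambiguity can be sketched as follows. This models only the code point range quoted above (0x3FFF80 to 0x3FFFFF for bytes 128 to 255). The function names are hypothetical and are not Emacs APIs.

```python
RAW_FIRST = 0x3FFF80  # code point standing in for raw byte 0x80
RAW_LAST = 0x3FFFFF   # code point standing in for raw byte 0xFF


def raw_byte_to_code_point(byte: int) -> int:
    """Map a raw byte in 0x80..0xFF to its reserved code point."""
    if not 0x80 <= byte <= 0xFF:
        raise ValueError("only bytes 0x80..0xFF have a raw-byte code point")
    return RAW_FIRST + (byte - 0x80)


def is_raw_byte(code_point: int) -> bool:
    """True if the code point lies in the reserved raw-byte range."""
    return RAW_FIRST <= code_point <= RAW_LAST


# Both candidates for the printed form \201 share the same low byte.
print(format(0x81, "o"))                  # octal digits of 0x81
print(hex(raw_byte_to_code_point(0x81)))  # the reserved code point
print(is_raw_byte(0x81), is_raw_byte(raw_byte_to_code_point(0x81)))
```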

There are other use cases for “mostly UTF-8 but not quite” formats. For example, WTF-8 is used to handle invalid UTF-16 conversions to UTF-8. The downside of these formats is that you lose compliance with the spec, which means you can’t use third-party string libraries that operate on code points. The Remacs team had to rewrite the primitive string type in their project to support raw bytes.
coredumped.dev/2023/01/17/desi

Design of Emacs in Rust

Joop Kiefte (LaPingvino) 🟙

𝋀𝋁𝋂𝋃𝋄𝋅𝋆𝋇𝋈𝋉𝋊𝋋𝋌𝋍𝋎𝋏𝋐𝋑𝋓 #kaktovik #unicode

stateful being

so, yeah, the other thing i want it to do is pop up a waveform display (#braille #unicode, or maybe #sixel? #whynotboth) to let you cut album rips into tracks. gonna need it for the original content too, since some of that is only archived on #youtube by this point...

i already implemented that feature in #tek, now i just gotta copy it over... ugh, i need motivation, and concentration, and medication!
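A rough sketch of the braille idea, not how #tek does it: each braille cell (U+2800 plus an 8-bit dot mask) is two dots wide and four dots tall, so one character can show two samples as bars of height 0 to 4. The function name and the pre-scaled integer levels are assumptions.

```python
# Dot bits listed bottom-to-top for the left and right column of a cell.
LEFT = (0x40, 0x04, 0x02, 0x01)   # dots 7, 3, 2, 1
RIGHT = (0x80, 0x20, 0x10, 0x08)  # dots 8, 6, 5, 4


def braille_bars(levels):
    """Render integer levels (0..4) as bars, two levels per braille character."""
    levels = [max(0, min(4, int(level))) for level in levels]
    if len(levels) % 2:
        levels.append(0)  # pad so every cell has a left and a right column
    cells = []
    for i in range(0, len(levels), 2):
        bits = 0
        for column, level in ((LEFT, levels[i]), (RIGHT, levels[i + 1])):
            for bit in column[:level]:
                bits |= bit
        cells.append(chr(0x2800 + bits))
    return "".join(cells)


print(braille_bars([0, 1, 2, 3, 4, 4, 3, 2, 1, 0]))
```

A real waveform view would first reduce each chunk of audio samples to a peak value and scale it to the 0 to 4 range.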

Mar 15, 2025, 17:28
Bok

Percent (sometimes 'per cent'): out of one hundred (%).

Permille (more often 'per mille'): out of one thousand (‰).

"Per ten thousand" has a symbol (‱), but no "per myriad" or "permyr" short name.

And "out of ten" lacks both a symbol and a "perdek" name.

It's gaps in #nomenclature (and #Unicode) like this that keep me up at night. #percentage

(Not to mention that the symbol with the two zeroes on bottom should logically mean "per 100", given the two noughts in that number. Ditto the one with one zero on bottom better meaning perdek.) #vocabulary
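For reference, the three signs that do exist can be looked up with Python's `unicodedata` module. A small sketch follows; the `SIGNS` table and `describe` helper are illustrative names, and the denominators are taken from the definitions above.

```python
import unicodedata

# Sign -> the denominator it stands for.
SIGNS = {"%": 100, "\u2030": 1_000, "\u2031": 10_000}


def describe(sign: str) -> str:
    """Return the code point, official Unicode name, and denominator of a sign."""
    return f"U+{ord(sign):04X} {unicodedata.name(sign)} (per {SIGNS[sign]})"


for sign in SIGNS:
    print(sign, describe(sign))
```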

Baptiste Mispelon

Hey fedi, any #unicode #emoji connoisseurs out there who might know where I could get a high definition image (or vector) of an emoji?
I've got Noto installed locally but it seems to be bitmap based since it starts to pixelate once the font size gets big enough.

𝙹𝚘𝚑𝚊𝚗

#MastodonTools #GlitchSoc #glitch #Mastodon #JavaScript

#Lifehack for those whose instances support #Markdown markup: if you need to write words that denote #tag​s in the plural, you can insert a zero-width space before the ending. Its #Unicode code point is U+200B, and in a post it has to be written as an HTML numeric character reference or, more mnemonically, as a named entity. The second option is preferable, because silly Mastodon will try to find a hashtag even inside a code block (see below) 😔

And if you post from the web interface, you can make yourself a #bookmarklet on the browser toolbar that copies this very space to the clipboard:

javascript:navigator.clipboard.writeText('\u200B')

Mar 13, 2025, 15:12