Terminal Emulators Battle Royale – Unicode Edition!
It turns out that Unicode support in Terminals is a lot more difficult than it first appears. A quick overview of special support for Unicode characters in Terminals:
- "Wide" or "Fullwidth" characters, particularly for East Asian languages and emojis, are codepoints that occupy two cells in a terminal instead of one.
- "Zero" width combining characters used in languages such as Arabic, Hebrew, or Hindi do not occupy any cells themselves; instead, they modify the previous character.
- "Zero Width Joiner" (ZWJ U+200D) reduces and combines many codepoints into a single emoji. This is similar to combining, but encoded in a completely different way.
- "Variation Selector-16" (VS-16 U+FE0F) is a special character that, for specific "Narrow" emojis consuming one cell, causes them to become "Wide", consuming two cells.