WorstPlans.com updates every Monday!

Your weekly source for terrible plans and ideas!

Tag: unicode

Programmers love this one weird trick to handle Unicode characters without any complexity! “Visual-literation” replaces the old-fashioned way of transliteration. Watch as linguists wail mournfully at the years they wasted trying to transliterate sounds between alphabets!

The issue:

Many computers are unable to handle letters that don’t fall into the set of Latin characters used by English.

Even though the Unicode standard has greatly improved multi-character-set accessibility, problems still arise:

  • A character might not exist in a chosen font. For example, “Egyptian Hieroglyph of a bird catching a fish” is probably not available in Comic Sans.
  • Systems may be unable to cope with characters that look exactly the same (“homoglyphs”: https://en.wikipedia.org/wiki/Homoglyph).
    • For example, “Latin A” (U+0041) and “Cyrillic A” (U+0410) look identical, but have different underlying Unicode code points.
    • So an email from “YOUR BANK.COM” might actually be from a different site, with an imposter letter “A” (https://en.wikipedia.org/wiki/IDN_homograph_attack).
    • (This is an issue in English as well, with 0 (zero) versus O (capital “o”) and “I / l / 1” (capital i, lower-case L, numeral 1).)
  • Systems may not allow certain letters for certain situations; for example, if your username is “Linear B ‘stone wheel’ + Mayan jaguar glyph,” it is extremely unlikely that you will have an easy time logging into your user account.
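To see the homoglyph problem from the list above in action, here is a minimal Python sketch that uses only the standard unicodedata module. The helper names (describe, scripts_used) are made up for this illustration, and guessing a script from the first word of a character's Unicode name is a crude assumption, not real confusable detection.

```python
import unicodedata

LATIN_A = "\u0041"     # LATIN CAPITAL LETTER A
CYRILLIC_A = "\u0410"  # CYRILLIC CAPITAL LETTER A


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def scripts_used(text: str) -> set:
    """Crude script guess (an assumption of this sketch): take the first
    word of each letter's Unicode name, e.g. LATIN or CYRILLIC."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}


genuine = "YOUR BANK"
imposter = "YOUR B" + CYRILLIC_A + "NK"  # looks the same on screen

print(describe(LATIN_A))
print(describe(CYRILLIC_A))
print(genuine == imposter)  # False: the strings differ in one code point
print(scripts_used(genuine))
print(scripts_used(imposter))
```

A name that mixes two scripts is not proof of an imposter, but it is a cheap hint that something is off.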

The current failure mode is usually to display a blank rectangle instead, which is unhelpful.

Proposal:

Instead, we can use a sophisticated image-recognition system to map each letter from every language onto one or more Latin characters (Fig. 1).

Usually, this is called transliteration (https://en.wikipedia.org/wiki/Transliteration). But in this case, rather than using the sound of a symbol to convert it, we are using the symbol’s visual appearance, so it’s more like “visual-literation.”

Fig. 1: With a limited character set, it may be easy to display the “Å” as “A”, or “ñ” as “n.” But it’s unclear what should be done with the Chinese character at the bottom, which isn’t similar to any specific Latin letter.

Fig. 2: Top: Image analysis reveals that the Chinese character (meaning “is”) can be most closely matched to the Latin capital “I.” Bottom: The Greek capital “Π” (pi) is disassembled into two Ts.

Some letters actually do somewhat resemble their Latin-ized versions (like “Π” as “TT”). However, some mappings are slightly less immediately obvious (Fig. 3).

Fig. 3: Many complex symbols can—with a great degree of squinting—be matched to multi-letter strings.
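A minimal sketch of how the lookup side of “visual-literation” might work, with heavy assumptions: the sophisticated image-recognition system is replaced by a tiny hand-written table (VISUAL_TABLE, a hypothetical name), accented letters are handled by stripping their marks via Unicode NFKD decomposition, and anything unrecognized becomes “?” instead of a blank rectangle. The function name visual_literate is also invented for this example.

```python
import unicodedata

# Hypothetical stand-in for the image-recognition step: a hand-written
# table of "looks like" mappings. Only the pi entry comes from Fig. 2.
VISUAL_TABLE = {
    "\u03a0": "TT",  # GREEK CAPITAL LETTER PI
    "\u220f": "TT",  # N-ARY PRODUCT, a look-alike of capital pi
}


def visual_literate(text: str) -> str:
    """Map each character to ASCII by appearance, under the assumptions above."""
    out = []
    for ch in text:
        if ch.isascii():
            out.append(ch)  # already representable
        elif ch in VISUAL_TABLE:
            out.append(VISUAL_TABLE[ch])
        else:
            # Decompose, then keep only the ASCII base letters, so an
            # A-with-ring becomes "A" and an n-with-tilde becomes "n".
            decomposed = unicodedata.normalize("NFKD", ch)
            base = "".join(c for c in decomposed if c.isascii())
            out.append(base if base else "?")
    return "".join(out)


print(visual_literate("\u00c5\u00f1"))  # the two easy cases from Fig. 1
print(visual_literate("\u03a0"))        # capital pi from Fig. 2
```

Symbols with no table entry and no ASCII base fall through to “?”, which is exactly where the squinting-based image analysis would have to take over.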

Conclusion:

Linguists will love this idea, which forever solves the problem of representing multiple character sets using only the very limited Latin letters.

PROS: Gives every word in every language an unambiguous mapping to a set of (26*2) = 52 Latin letters.

CONS: Many symbols may map to the same end result (for example, “I” could be the English word “I,” or it could have been a “visual-literated” version of the Chinese character in Fig. 2).

Fig. 4: A collection of potential mappings from various symbols to an ASCII equivalent. Finally, the days of complex transliteration are over!

Become vexed that you are unable to find a red-headed emoji face!

Background: An emoji overview:

Fig 1: Current Apple emoji skin tones. Available tones vary by emoji font designer (e.g., Google, Microsoft).

Emoji people were originally only available with a light skin tone. Recently, more skin tone options have been added (Fig 1).

However, they are just a recoloring of the original emoji, and thus may not have realistic hair options. For example, the only women’s emoji hair option (as seen above) is “long and straight” and the only men’s hair option is “short and generally indistinct.”

Below, we will propose a method for easily allowing custom colors by using a phone camera, but first let us examine the present emoji situation.

The current state of the art:

Fig 2: Emoji families (or possibly “emoji movie theater with low seats, and two children in a row in front of two adults”) currently only exist in this one shade.

Fig 3: Unlike an emoji family, Emoji Santa Claus may have varying skin tone.
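For the curious, the “recoloring” is visible in the text itself: in Unicode, a toned emoji is the base emoji followed by one of five skin tone modifier code points (U+1F3FB through U+1F3FF). The sketch below builds the five Santa variants in Python. The helper name with_tone is made up for this example, and whether each pair renders as a single glyph depends on the font installed.

```python
import unicodedata

SANTA = "\U0001F385"  # FATHER CHRISTMAS

# The five skin tone modifiers occupy U+1F3FB through U+1F3FF.
TONE_MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]


def with_tone(base: str, modifier: str) -> str:
    """Append a tone modifier to a base emoji. Fonts that support the
    combination draw the two code points as one recolored glyph."""
    return base + modifier


for mod in TONE_MODIFIERS:
    print(with_tone(SANTA, mod), unicodedata.name(mod))
```

Each variant is two code points long, which is why a system without font support may show Santa followed by a stray colored square.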

Fig 4: Emoji cats can have multiple facial expressions. The emoji cat is unique among non-human animals in having a wide range of facial expressions.

Fig 5: Unlike the cat, the emoji snake has no ability to express emotion. Font limitations may make infinite combinations of facial expressions / skin (or scale) colors impractical, so less popular options (“coral snake that is crying while listening to music on 70s headphones”) are not currently available.

Fig 6: The emoji fish exists in six variants—pufferfish, yellow fish, blue fish, dolphin, lungfish (cartoon), and lungfish (realistic).

Proposal:

Instead of selecting from a list, a user could set an emoji skin / fur / scale tone using the built-in camera in their phone (Fig 7).

Fig 7: With the cameraphone in their left hand, this tomato-colored user is taking a picture of their right hand for use in the auto-emoji-coloring algorithm. Now the emoji people on this phone will have a tomato option.
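A rough sketch of what the auto-emoji-coloring algorithm might do, assuming the photo has already been reduced to a list of (red, green, blue) pixel values and that the emoji artwork is stored as grayscale shading. Both function names (average_color, tint) and the whole pipeline are hypothetical, and plain Python is used instead of a real imaging library.

```python
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def average_color(pixels: Iterable[RGB]) -> RGB:
    """Average the sampled pixels channel by channel to get one tone."""
    samples = list(pixels)
    if not samples:
        raise ValueError("need at least one pixel to sample a tone")
    n = len(samples)
    return tuple(round(sum(p[i] for p in samples) / n) for i in range(3))


def tint(shade: int, tone: RGB) -> RGB:
    """Scale the tone by a 0-255 grayscale shade from the emoji template,
    so highlights and shadows survive the recoloring."""
    return tuple(channel * shade // 255 for channel in tone)


# Two pixels sampled from the back of a tomato-colored hand (made-up values).
hand_pixels = [(250, 100, 70), (254, 98, 72)]
tone = average_color(hand_pixels)
print(tone)             # (252, 99, 71)
print(tint(255, tone))  # a full-brightness template pixel keeps the tone
print(tint(0, tone))    # a black template pixel stays black
```

Pointing the same sampling step at a cat instead of a hand is left as an exercise for the cat.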

Fig 8: Now that we’ve decoupled eye color, hairstyle, hair color, and skin color, it is possible to make any combination of features. These new features can be applied to all animal emoji as well. If you want your cat emoji to be colored the same as your actual cat, you could take a picture of your cat instead of your hand. Perhaps you could even make the “car emoji” the same make and model as your actual car!

Conclusion:

It was apparently possible to add the flags of every country in the world, plus Antarctica (ant), so clearly space is not extremely limited. Perhaps Blue Emoji Cat With Red Whiskers really will be added in a future Unicode update.

PROS: Opens up a new world of hilariously colored animal emoji. Increases employment for font designers and font-related programmers.

CONS: Opens up a new world of font-related bugs. Assumes you’re willing to have a 250 megabyte font of “all combinations of human and animal skin / scale / fur / feather tones, hairstyles, hair colors, and eye colors” in memory on your phone at all times.