Talk to digital assistants (like Siri or Alexa) using a new, unique, computer-only “command language”! No more confusion!

Background:

When talking to a digital assistant (e.g. Siri, Alexa, Google Home, probably dozens of others), the user can generally speak in a normal human language.

For example: “Alexa, play ‘Cool Music for Cool People’.”

The Issue:

Unfortunately, the ability to use natural language to communicate with these digital assistants is currently somewhat misleading: they only understand a restricted subset of language, and even slightly-complicated queries can befuddle the poor machine.

In the example above, the assistant might respond in any of several wrong ways: “I don’t know that song” (wrong: it’s a playlist); “Now playing your playlist on Apple Music” (wrong: it’s actually a playlist on Spotify); “Now playing this other song with the word ‘cool’ in it” (wrong: it’s just guessing); or “Doing an Internet search for ‘Cool Music’” (wrong: a useless fallback for when the device has no idea what the user wants).

Additionally, in some languages—like English—some crucially-different words are similar in sound. For example, “turn the lights on” is occasionally confused for “turn the lights off,” due to the similarity of the first sound in the words “on” and “off” (Fig 1).

Fig. 1: One issue with using human language is the possibility of similar-sounding words confusing a device, as shown here. This sometimes happens if the user is far away, or the assistant has a magazine on top of it or something.

Proposal:

What we need is a special “command language” for Alexa/Siri. This would be like how someone riding a horse uses specific words like “whoa” and “gee” and “haw,” rather than saying “Horse, I would appreciate if you would stop as soon as is practicable.” 

If we pick words that are only used in this new made-up language, and not in any existing human languages, then the device won’t need a “wake word” (i.e. you won’t have to say “Siri” or “Alexa”), because it will know you are talking to it, and not to a human.
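To make the wake-word claim concrete, here is a toy sketch of how it could work. The vocabulary and function names below are illustrative assumptions (no real assistant exposes such an API): because a word like LUXOG appears in no human language, merely spotting one in a transcript can serve as the “wake” signal.

```python
# Toy sketch: the invented command-language vocabulary is an assumption
# taken from the figures in this proposal, not any real assistant's API.
COMMAND_WORDS = {"LUXOG", "TAOGMIN", "ZOM"}

def is_addressed_to_device(transcript: str) -> bool:
    """True if any token of the utterance belongs to the command language.

    Since these words exist in no human language, their presence alone
    tells the device it is being addressed -- no wake word needed.
    """
    return any(word.upper() in COMMAND_WORDS for word in transcript.split())

print(is_addressed_to_device("luxog hall"))          # True: command word present
print(is_addressed_to_device("turn the lights on"))  # False: plain English
```

A side benefit of this scheme is that the check is trivially cheap compared to always-on wake-word acoustic models, though it does assume the speech has already been transcribed.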

Figure 2 shows how this proposal would greatly shorten the command from Figure 1.

Fig. 2: By making a new word that means “turn the lights on” (LUXOG), we can shorten our interactions with this digital assistant and remove the earlier “on / off” ambiguity problem. In this case, the device responds with ZOM, which is the new computer-speak word for “OK.” The icons in the top left correspond to the rune for “LUXOG” and the rune for the English word “hall” (which you can see written in the bubble).

The symbols next to the bubbles indicate a non-phonetic runic inscription that can be used to make it even clearer that we’re talking to a machine. Anyone reading such hieroglyphs would know that they were intended as commands to a machine, which would be a benefit that is totally worth having to memorize hundreds of new symbols. (This will also create jobs at the Unicode Consortium.)

Figure 3 shows another example where this command language vastly cuts down on the verbal interaction required with the computer.

Fig. 3: Here, TAOGMIN means “set timer for ___ minutes,” where the minutes then follow. Currently, the minutes are given in a human language (English, here), but perhaps a completely new numbering system could also be devised.
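The two commands proposed so far form a tiny grammar, which can be sketched as a toy interpreter. Everything here is an illustrative assumption built from the figures (LUXOG = “turn the lights on,” TAOGMIN = “set timer for ___ minutes,” ZOM = “OK”); for simplicity the minutes are accepted as digits, though the figure shows them spoken in English.

```python
# Toy interpreter for the proposal's two commands. Handler behavior is
# stubbed out with comments; "ZOM" is the proposed computer-speak "OK".
def interpret(utterance: str) -> str:
    tokens = utterance.upper().split()
    if not tokens:
        return "?"
    if tokens[0] == "LUXOG":
        room = " ".join(tokens[1:]) or "default room"
        # ...here the device would switch on the lights in `room`...
        return "ZOM"
    if tokens[0] == "TAOGMIN" and len(tokens) == 2 and tokens[1].isdigit():
        minutes = int(tokens[1])
        # ...here the device would start a timer for `minutes` minutes...
        return "ZOM"
    # Unrecognized input: better to admit confusion than to guess
    # (or to "do an Internet search" as a useless fallback).
    return "?"

print(interpret("LUXOG hall"))   # ZOM
print(interpret("TAOGMIN 10"))   # ZOM
print(interpret("play music"))   # ?
```

Note how the one-word-per-command design removes the “on / off” ambiguity entirely: there is no LUXOG-like word for “off” yet, so nothing similar-sounding to misrecognize.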

PROS: Should speed up interactions with digital assistants!

CONS: It might be difficult to ensure that this language is only used when talking to computers. If humans start adopting these words, then they won’t serve double-duty as a device “wake word,” and we’ll be right back where we started.