Chapter 8

Relying on AI to Improve Human Interaction

IN THIS CHAPTER

· Communicating in new ways

· Sharing ideas

· Employing multimedia

· Improving human sensory perception

People interact with each other in myriad ways. In fact, few people realize just how many different forms communication can take. When many people think about communication, they think about writing or talking. However, interaction can take many other forms, including eye contact, tonal quality, and even scent, as described in “The Truth about Pheromones” in Smithsonian Magazine. An example of the computer version of enhanced human interaction is the electronic nose, which relies on a combination of electronics, biochemistry, and artificial intelligence to perform its task and has been applied to a wide range of industrial applications and research (see https://tinyurl.com/488jfzut). In fact, the electronic nose can even sniff out diseases (see https://tinyurl.com/28cfcjek). This chapter concentrates mostly on more standard forms of communication, however, including body language. You get a better understanding of how AI can enhance human communication through means that are less costly than building your own electronic nose.

AI can also enhance the manner in which people exchange ideas. In some cases, AI provides entirely new methods of communication, but in many cases, AI provides a subtle (or sometimes not so subtle) method of enhancing existing ways to exchange ideas. Humans rely on exchanging ideas to create new technologies, build on existing technologies, or learn about technologies needed to increase an individual’s knowledge. Ideas are abstract, which makes exchanging them particularly difficult at times, so AI can provide a needed bridge between people.

At one time, if someone wanted to store their knowledge to share with someone else, they generally relied on writing. In some cases, they could also augment their communication by using graphics of various types. However, only some people can use these two forms of media to gain new knowledge; many people require more, which is why online sources such as YouTube have become so popular. Interestingly enough, you can augment the power of multimedia, which is already substantial, by using AI, and this chapter tells you how.

The final section of this chapter helps you understand how an AI can give you almost superhuman sensory perception. Perhaps you really want that electronic nose after all; it provides significant advantages in detecting scents far too faint for humans to notice. Imagine being able to smell at the same level as a dog does (which uses 100 million aroma receptors, versus the 1 million that humans possess). It turns out there are two ways to achieve this goal: using monitors that a human accesses indirectly, and directly stimulating human sensory perception.

Developing New Ways to Communicate

Communication involving a developed language initially took place between humans via the spoken rather than the written word. The only problem with spoken communication is that the two parties must be near enough to each other to talk. Consequently, written communication is superior in many respects because it allows time-delayed communication that doesn’t require the two parties to ever see each other. The three main methods of human nonverbal communication rely on

· Alphabets/Iconographs: The abstraction of components of human words or symbols

· Language: The stringing of words or symbols together to create sentences or convey ideas in written form

· Body language: The augmentation of language with context

The first two methods are direct abstractions of the spoken word. They aren’t always easy to implement, but people have been doing so for thousands of years. The body-language component is the hardest to implement because you’re trying to create an abstraction of a physical process. Writing helps convey body language using specific terminology, such as that described at https://tinyurl.com/27newjbf. However, the written word falls short, so people augment it with symbols, such as emoticons and emoji (read about their differences at https://tinyurl.com/9jyc5n4n). The following sections discuss these issues in more detail.

Creating new alphabets

The introduction to this section discusses two new alphabets used in the computer age: emoticons and emoji (https://tinyurl.com/wjsw8te5 and https://emojipedia.org/). The sites that catalog these two graphic alphabets online list hundreds of them. For the most part, humans can interpret these iconic alphabets without too much trouble because they resemble facial expressions, but an application doesn’t have the human sense of art, so computers often require an AI just to figure out what emotion a human is trying to convey with the little pictures. Fortunately, you can find standardized lists, such as the Unicode emoji chart at https://tinyurl.com/j4bdmm3m. Of course, a standardized list doesn’t actually help with translation. The article “An Algorithm Trained on Emoji Knows When You’re Being Sarcastic on Twitter” at MIT Technology Review.com provides more details on how someone can train an AI to interpret and react to emoji (and, by extension, emoticons). You can actually see an example of this process at work at https://deepmoji.mit.edu/.
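If you’re curious how such training works at the smallest scale, the following Python sketch uses the scikit-learn library to teach a classifier that certain emoji signal certain emotions. The handful of labeled sentences is invented purely for illustration; a production system such as DeepMoji learns from millions of real tweets rather than six examples.

# A toy sketch of training a classifier to associate emoji with emotions.
# The labeled sentences below are invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "Great job today 😀", "Love this so much ❤️", "That was awesome 👍",
    "This is terrible 😡", "I can't believe it 😢", "Worst day ever 👎",
]
labels = ["positive", "positive", "positive",
          "negative", "negative", "negative"]

# Character-level features let individual emoji act as signals.
model = make_pipeline(
    CountVectorizer(analyzer="char"),
    LogisticRegression(max_iter=1000),
)
model.fit(texts, labels)

print(model.predict(["Thanks, that really helped 😀"]))  # most likely 'positive'
print(model.predict(["You broke the build again 😡"]))   # most likely 'negative'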

The emoticon is an older technology, and many people are trying their best to forget it (but likely won’t succeed because emoticons are so easy to type). The emoji, however, is new and exciting enough to warrant a movie called The Emoji Movie. You can also rely on Google’s AI to turn your selfies into emoji (see “Google’s Allo Morphs Your Selfies into Custom Emojis” at CNET). Many people have a hard time figuring some emoji out, so you can check Emojipedia (https://emojipedia.org/) to see what they mean.

Remember Humans have created new alphabets to meet specific needs since the beginning of the written word. Emoticons and emoji represent two of many alphabets that you can count on humans creating as a result of the Internet and the use of AI. In fact, it may actually require an AI to keep up with them all. However, it’s equally important to remember that some characters are lost as time progresses. For example, check out the article “12 Characters that Didn’t Make the Alphabet” at MentalFloss.com.

Working with emoji and other meaningful graphics

Many text references today are sprinkled with emoji and other iconographs. Most of the references you see online today deal with emoji and emoticons, either removing them or converting them to text, as explained in the article “Text Preprocessing: Handle Emoji and Emoticons” at Study Machine Learning.com.
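Here’s a minimal Python sketch of that kind of preprocessing. It assumes the third-party emoji package is installed (pip install emoji; replace_emoji needs version 2.0 or later) and uses a deliberately tiny regular expression for emoticons; a production pipeline would rely on a much larger emoticon dictionary.

# A minimal preprocessing sketch. Requires the third-party 'emoji' package
# (pip install emoji); replace_emoji needs version 2.0 or later.
import re
import emoji

EMOTICONS = re.compile(r"[:;=]-?[)(DPp]")  # matches :) ;-) :D :p and similar

def preprocess(text, keep_meaning=True):
    if keep_meaning:
        # Convert each emoji to a readable alias, for example 👍 becomes thumbs_up.
        text = emoji.demojize(text, delimiters=(" ", " "))
    else:
        # Or strip emoji entirely.
        text = emoji.replace_emoji(text, replace="")
    return EMOTICONS.sub(" ", text).strip()

print(preprocess("Nice work on the release 👍 :)"))
# Prints roughly: Nice work on the release  thumbs_up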

It’s not uncommon to find bits and pieces of other languages sprinkled throughout a text, and these words or phrases need to be handled in a meaningful way. That’s where articles like “Chinese Natural Language (Pre)processing: An Introduction” at Towards Data Science.com can come in very handy. The problem with translating some languages into a form that can act as input to a Natural Language Processing (NLP) model is that the concept of the language differs from English. For example, written Chinese conveys ideas rather than pronunciation as English does, and it doesn’t separate words with spaces, so even splitting a sentence into words requires special handling.
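As a small illustration of why the preprocessing differs, the following sketch uses the popular jieba package (pip install jieba) to split a Chinese sentence into words, a step that plain str.split() can’t perform because the text contains no spaces.

# A small sketch of Chinese word segmentation using the third-party 'jieba'
# package (pip install jieba). The sentence means "I love natural language
# processing" and contains no spaces between words.
import jieba

sentence = "我爱自然语言处理"
tokens = jieba.lcut(sentence)
print(tokens)  # typically something like ['我', '爱', '自然语言', '处理']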

Some situations also require that you process meaningful graphics because part of the text meaning is in the graphic. This sort of translation need commonly arises in technical or medical texts. The article “How Image Analysis and Natural Language Processing can be combined to improve Precision Medicine” by Obi Igbokwe at Medium.com discusses how to accomplish this task.

Remember The point of these various translations of nontext into a textual form is that humans communicate in many ways, and AI can help make such communication easier and improve comprehension. In addition, using AI to perform NLP makes it possible to look for patterns, even in text that is heavily imbued with nontext elements.

Automating language translation

The world has always had a problem with the lack of a common language. Yes, English has become more or less universal, but it’s still not completely so. Having someone translate between languages can be expensive, cumbersome, and error prone, so translators, although necessary in many situations, aren’t necessarily a great answer either. For those who lack the assistance of a translator, dealing with other languages can be quite difficult, which is where applications such as Google Translate (see Figure 8-1) come into play.

One of the things you should note in Figure 8-1 is that Google Translate offers to automatically detect the language for you. What’s interesting about this feature is that it works extremely well in most cases. Part of the credit for this feature goes to the Google Neural Machine Translation (GNMT) system, which can look at entire sentences to make sense of them and provide better translations than applications that use phrases or words as the basis for creating a translation (see https://tinyurl.com/8by975xx for details).


FIGURE 8-1: Google Translate is an example of AI that performs an essential, everyday task.
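You can experiment with the detection idea yourself. The following sketch uses the third-party langdetect package (pip install langdetect), which is far simpler than Google’s system but illustrates the same first step: guessing the source language before translating.

# A minimal language-detection sketch using the third-party 'langdetect'
# package (pip install langdetect). Detection is statistical, so results can
# vary slightly between runs on very short strings.
from langdetect import detect

samples = [
    "Guten Morgen, wie geht es dir?",   # German
    "¿Dónde está la biblioteca?",       # Spanish
    "Where is the library?",            # English
]
for sample in samples:
    print(sample, "->", detect(sample))  # usually 'de', 'es', 'en'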

Technicalstuff What is even more impressive is that GNMT can translate between languages even when it doesn’t have a specific translator for the pair, by using an artificial language, an interlingua (see https://tinyurl.com/wcfk24jc). However, it’s important to realize that an interlingua doesn’t function as a universal translator; it’s more of a universal bridge. Say that GNMT doesn’t know how to translate between Chinese and Spanish. However, it can translate between Chinese and English and between English and Spanish. By building a 3-D network representing these three languages (the interlingua), GNMT is able to create its own translation between Chinese and Spanish. Unfortunately, this system won’t work for translating between Chinese and Martian because no method is available yet to understand and translate Martian into any human language. Humans still need to create a base translation for GNMT to do its work.
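The following toy sketch illustrates only the bridging idea, pivoting word for word through English with invented two-entry dictionaries. GNMT actually learns a shared internal representation rather than chaining lookups, so treat this as an analogy, not a description of how GNMT works.

# A toy pivot through English with invented dictionaries. GNMT's interlingua
# is a learned representation, not a lookup table like this.
zh_to_en = {"你好": "hello", "谢谢": "thank you"}
en_to_es = {"hello": "hola", "thank you": "gracias"}

# Compose the two dictionaries to get a Chinese-to-Spanish "translator."
zh_to_es = {zh: en_to_es[en] for zh, en in zh_to_en.items() if en in en_to_es}
print(zh_to_es)  # {'你好': 'hola', '谢谢': 'gracias'}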

Incorporating body language

A significant part of human communication occurs with body language, which is why the use of emoticons and emoji is important. However, people are becoming more used to working directly with cameras to create videos and other forms of communication that involve no writing. In this case, a computer could possibly listen to human input, parse it into tokens representing the human speech, and then process those tokens to fulfill a request, similar to the manner in which Alexa, Google Home, and their ilk work.

Remember Unfortunately, merely translating the spoken word into tokens won’t do the job because the whole issue of nonverbal communication remains. In this case, the AI must be able to read body language directly. The article “Computer Reads Body Language” from Carnegie Mellon University discusses some of the issues that developers must solve to make reading body language possible. The picture at the beginning of that article gives you some idea of how the computer camera must capture human positions to read the body language, and the AI often requires input from multiple cameras to make up for such issues as having part of the human anatomy obscured from the view of a single camera. The reading of body language involves interpreting these human characteristics:

· Posture

· Head motion

· Facial expression

· Eye contact

· Gestures

Of course, there are other characteristics, but if an AI can handle even these five areas, it goes a long way toward providing a correct body-language interpretation. In addition to body language, current AI implementations also take characteristics like tonal quality into consideration, which makes for an extremely complex AI that still doesn’t come close to doing what the human brain does seemingly without effort.
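To get a feel for the first step, locating the body in an image, you can try the following sketch. It assumes the mediapipe and opencv-python packages are installed and that a file named person.jpg (a hypothetical example image) exists; real body-language systems combine several camera views and interpret the landmarks over time rather than from a single frame.

# A minimal pose-reading sketch. Requires the 'mediapipe' and 'opencv-python'
# packages; person.jpg is a hypothetical example image.
import cv2
import mediapipe as mp

image = cv2.imread("person.jpg")              # OpenCV loads images as BGR
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB input

with mp.solutions.pose.Pose(static_image_mode=True) as pose:
    results = pose.process(rgb)

if results.pose_landmarks:
    # MediaPipe returns 33 landmarks (nose, shoulders, wrists, and so on),
    # each with x, y, z coordinates and a visibility score.
    nose = results.pose_landmarks.landmark[mp.solutions.pose.PoseLandmark.NOSE]
    print(f"Nose at ({nose.x:.2f}, {nose.y:.2f}), visibility {nose.visibility:.2f}")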

Technicalstuff After an AI can read body language, it must also provide a means to output it when interacting with humans. Given that reading body language (facial expressions, body position, placement of hands, and so on) is in its infancy, robotic or graphic presentation of body language is even less developed. The article “Robots Learn to Speak Body Language” at IEEE Spectrum.org points out that robots can currently interpret body language and then react appropriately in a few cases. Robots are currently unable to create good facial expressions, so according to the article “Realistic Robot Faces Aren’t Enough” at The Conversation.com, the best-case scenario is to substitute posture, head motion, and gestures for facial expression. You can find yet more details in the ScienceDirect.com article “Facially expressive humanoid robotic face.” The result isn’t all that impressive, yet.

Exchanging Ideas

An AI doesn’t have ideas because it lacks both intrapersonal intelligence and the ability to understand. However, an AI can enable humans to exchange ideas in a manner that creates a whole that is greater than the sum of its parts. In many cases, the AI isn’t performing any sort of exchange. The humans involved in the process perform the exchange by relying on the AI to augment the communication process. The following sections provide additional details about how this process occurs.

Creating connections

A human can exchange ideas with another human, but only as long as the two humans know about each other. The problem is that many experts in a particular field don’t actually know each other — at least, not well enough to communicate. An AI can perform research based on the flow of ideas that a human provides and then create connections with other humans who have that same (or similar) flow of ideas.

One of the ways in which these connections form is on social media sites such as LinkedIn (https://www.linkedin.com/), where the idea is to create connections between people based on a number of criteria. A person’s network becomes the means by which the AI deep inside LinkedIn suggests other potential connections. Ultimately, the purpose of these connections from the user’s perspective is to gain access to new human resources, make business contacts, create a sale, or perform other tasks that LinkedIn enables using the various connections.
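The following simplified sketch shows the core of the “people you may know” idea: rank strangers by how many connections they share with you. The tiny network is invented for illustration, and a real service such as LinkedIn also weighs shared employers, schools, skills, and activity, not just mutual connections.

# A simplified "people you may know" sketch: rank strangers by the number of
# connections they share with you. The network below is invented.
from collections import Counter

connections = {
    "ann":   {"bob", "carol", "dave"},
    "bob":   {"ann", "carol", "eve"},
    "carol": {"ann", "bob", "frank"},
    "dave":  {"ann"},
    "eve":   {"bob"},
    "frank": {"carol"},
}

def suggest(person, top_n=3):
    mine = connections[person]
    mutual = Counter()
    for friend in mine:
        for candidate in connections[friend]:
            if candidate != person and candidate not in mine:
                mutual[candidate] += 1  # one shared connection found
    return mutual.most_common(top_n)

print(suggest("ann"))  # for example, [('eve', 1), ('frank', 1)]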

Augmenting communication

To exchange ideas successfully, two humans need to communicate well. The only problem is that humans sometimes don’t communicate well, and sometimes they don’t communicate at all. The issue isn’t just one of translating words but also ideas. The societal and personal biases of individuals can preclude communication because an idea that works for one group may not translate at all for another group. For example, the laws in one country could lead someone to think one way, while the laws in another country could lead another person to think in an entirely different manner.

Theoretically, an AI could help communication between disparate groups in numerous ways. Of course, language translation (assuming that the translation is accurate) is one of these methods. However, an AI could provide cues as to what is and isn’t culturally acceptable by prescreening materials. Using categorization, an AI could also suggest aids like alternative graphics and so on to help communication take place in a manner that helps both parties.

Defining trends

Humans often base ideas on trends. However, to visualize how the idea works, other parties in the exchange of ideas must also see those trends, and communicating using this sort of information is notoriously difficult. AI can perform various levels of data analysis and present the output graphically. The AI can analyze the data in more ways and faster than a human can so that the story the data tells is specifically the one you need it to tell. The data remains the same; the presentation and interpretation of the data change.

Studies show that humans relate better to graphical output than to tabular output, and graphical output definitely makes trends easier to see. You generally use tabular data to present specific values; graphics generally work best for showing trends (see https://tinyurl.com/3hwsjwcy). Using AI-driven applications can also make creating the right sort of graphic output for a particular requirement easier. Not all humans see graphics in precisely the same way, so matching a graphic type to your audience is essential.
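As a quick illustration, the following sketch (which assumes matplotlib and numpy are installed) turns a made-up column of monthly sales numbers into a line chart with a fitted trend line, the kind of presentation that makes an upward trend obvious at a glance.

# Plot made-up monthly sales as a trend line instead of a table.
# Requires numpy and matplotlib.
import numpy as np
import matplotlib.pyplot as plt

months = np.arange(1, 13)
sales = np.array([10, 12, 11, 14, 15, 17, 16, 19, 21, 20, 23, 25])  # hypothetical data

slope, intercept = np.polyfit(months, sales, 1)  # fit a straight trend line

plt.plot(months, sales, "o-", label="Monthly sales")
plt.plot(months, slope * months + intercept, "--", label=f"Trend: +{slope:.1f} per month")
plt.xlabel("Month")
plt.ylabel("Units sold (thousands)")
plt.legend()
plt.show()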

Using Multimedia

Most people learn by using multiple senses and multiple approaches. A doorway to learning that works for one person may leave another completely mystified. Consequently, the more ways in which a person can communicate concepts and ideas, the more likely it is that other people will understand what the person is trying to communicate. Multimedia normally consists of sound, graphics, text, and animation, but some multimedia does more.

AI can help with multimedia in numerous ways. One of the most important is in the creation, or authoring, of the multimedia. You find AI in applications that help with everything from media development to media presentation. For example, when translating the colors in an image, an AI may provide the benefit of helping you visualize the effects of those changes faster than trying one color combination at a time (the brute-force approach).

After using multimedia to present ideas in more than one form, those receiving the ideas must process the information. A secondary use of AI relies on the use of neural networks to process the information in various ways. Categorizing the multimedia is an essential use of the technology today. However, in the future you can look forward to using AI to help in 3-D reconstruction of scenes based on 2-D pictures. Imagine police being able to walk through a virtual crime scene with every detail faithfully captured.
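As a taste of the categorization task, the following sketch runs a pretrained image-classification network from torchvision (version 0.13 or later) against a hypothetical photo.jpg and prints the three most likely categories. The same approach, scaled up, is what lets an AI tag an entire multimedia library automatically.

# Categorize a hypothetical photo.jpg with a pretrained network.
# Requires torch, torchvision 0.13 or later, and Pillow.
import torch
from PIL import Image
from torchvision import models

weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights).eval()

preprocess = weights.transforms()  # resize, crop, and normalize as the model expects
batch = preprocess(Image.open("photo.jpg")).unsqueeze(0)

with torch.no_grad():
    probabilities = model(batch).softmax(dim=1)

top = probabilities.topk(3)
for idx, prob in zip(top.indices[0].tolist(), top.values[0].tolist()):
    print(weights.meta["categories"][idx], f"{prob:.1%}")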

MULTIMEDIA AND ADDRESSING PEOPLE’S FUNCTIONAL NEEDS

Most people have some particular functional need relating to how they take in and understand information. (If you’re wondering about the use of the new accessibility terms in this book, please check out the resource at https://ncdj.org/style-guide/.) Considering such needs as part of people’s use of multimedia is important. The whole intent of multimedia is to communicate ideas in as many ways as possible so that just about everyone can understand the ideas and concepts you want to present. Even when a presentation as a whole uses multimedia successfully, individual ideas can become lost when the presentation uses only one method to communicate them. For example, communicating a sound only aurally almost guarantees that only those with really good hearing will actually receive the idea. A subset of those with the required hearing still won’t get the idea because it may appear as only so much noise to them, or they simply don’t learn through the limited method offered in the presentation. Using as many methods as possible to communicate each idea is essential if you want to reach as many people as possible.

People used to speculate that various kinds of multimedia would appear in new forms. For example, imagine a newspaper that provides Harry Potter-like dynamic displays. Most of the technology pieces are actually available today, but the issue comes down to the market. For a technology to become successful, it must have a market — that is, a means for paying for itself.

Embellishing Human Sensory Perception

One way that AI truly excels at improving human interaction is by augmenting humans, which it can do in two ways: by allowing them to use their native senses to work with augmented data, or by augmenting the native senses to do more. The following sections discuss both approaches to enhancing human sensing and thereby improving communication.

Shifting data spectrum

When performing various kinds of information gathering, humans often employ technologies that filter or shift the data spectrum with regard to color, sound, touch, or smell. The human still uses native capabilities, but some technology changes the input such that it works with those native capabilities. One of the most common examples of spectrum shifting is in astronomy, in which shifting and filtering light enables people to see astronomical elements, such as nebulae, in ways that the naked eye can’t, thereby improving our understanding of the universe.
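The following sketch shows the spectrum-shifting idea in miniature: a made-up grid of infrared intensities, which the eye can’t see directly, is remapped onto visible colors with a false-color palette, the same trick astronomers use for radio, infrared, and X-ray data. It assumes numpy and matplotlib are installed.

# Remap a made-up grid of infrared intensities onto visible colors.
# Requires numpy and matplotlib.
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical 100 x 100 infrared intensity map with one bright "hot spot."
y, x = np.mgrid[0:100, 0:100]
infrared = np.exp(-((x - 60) ** 2 + (y - 40) ** 2) / 400.0)

plt.imshow(infrared, cmap="inferno")  # false color: invisible data, visible palette
plt.colorbar(label="Relative infrared intensity")
plt.title("False-color rendering of non-visible data")
plt.show()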

Teaching a robot to feel through touch is in its infancy, as described at https://tinyurl.com/ukjscsye and https://tinyurl.com/y998s2h8. Most efforts today focus on helping the robot work better by using tactile responses as part of manipulating its machinery, such as the light touch needed to grasp an egg versus the heavier touch required to lift a barbell. When this technology moves far enough forward, it might become possible for various AIs to communicate with humans through direct touch or the description of various kinds of touch.

Shifting and filtering colors, sounds, touches, and smells manually can require a great deal of time, and the results can disappoint even when performed expertly, which is where AI comes into play. An AI can try various combinations far faster than a human, and can locate the potentially useful combinations with greater ease because an AI performs the task in a consistent manner.

Technicalstuff The most intriguing technique for exploring our world, however, is completely different from what most people expect. What if you could smell a color or see a sound? The occurrence of synesthesia, which is the use of one sense to interpret input from a different sense, is well documented in humans (see https://tinyurl.com/2s9tb284; you can also find a range of related articles at https://tinyurl.com/a8y44n2p). People use AI to help study the effect, as described at https://tinyurl.com/ec6mpbpa. The interesting use of this technology, though, is to create a condition in which other people can actually use synesthesia as another means to see the world (see https://www.fastcompany.com/3024927/this-app-aids-your-decision-making-by-mimicking-its-creators-synesthesia).

Augmenting human senses

As an alternative to using an external application to shift data spectrum and somehow make that shifted data available for use by humans, you can augment human senses. In augmentation, a device, either external or implanted, enables a human to directly process sensory input in a new way. Many people view these new capabilities as the creation of cyborgs, as described at https://tinyurl.com/xyvxcxbm. The idea is an old one: Use tools to make humans ever more effective at performing an array of tasks. In this scenario, humans receive two forms of augmentation: physical and intellectual.

Physical augmentation of human senses already takes place in many ways, and it’s guaranteed to increase as humans become more receptive to various kinds of implants. For example, night vision glasses currently allow humans to see at night, with high-end models providing color vision controlled by a specially designed processor. In the future, eye augmentation/replacement could allow people to see any part of the spectrum. This augmentation/replacement would be controlled by the person’s thoughts, so that people would see only that part of the spectrum needed to perform a specific task.

Intelligence Augmentation requires more intrusive measures but also promises to allow humans to exercise far greater capabilities. Unlike AI, Intelligence Augmentation (IA) has a human actor at the center of the processing. The human provides the creativity and intent that AI currently lacks. You can read a discussion of the differences between AI and IA in “Intelligence Augmentation Vs. Artificial Intelligence” at the Planergy blog.
