
Think AI Can Perceive Emotion? Think Again.

By Deric Bownds @DericBownds

Numerous MindBlog posts have presented the work and writing of Lisa Feldman Barrett (enter Barrett in the search box in the right column of this web page). Her book, "How Emotions Are Made," is the one I recommend when anyone asks me what I think is the best popular book on how our brains work. Here I want to pass on her piece on AI and emotions in the Sat. May 18 Wall Street Journal, which brings together the various reasons that AI cannot, and should not, be used to detect our emotional state from our facial expressions or other body language. Here is her text:

Imagine that you are interviewing for a job. The interviewer asks a question that makes you think. While concentrating, you furrow your brow and your face forms a scowl. A camera in the room feeds your scowling face to an AI model, which determines that you’ve become angry. The interview team decides not to hire you because, in their view, you are too quick to anger. Well, if you weren’t angry during the interview, you probably would be now.

This scenario is less hypothetical than you might realize. So-called emotion AI systems already exist, and some are specifically designed for job interviews. Other emotion AI products try to create more empathic chatbots, build more precise medical treatment plans and detect confused students in classrooms. But there’s a catch: The best available scientific evidence indicates that there are no universal expressions of emotion.

In real life, angry people don’t commonly scowl. Studies show that in Western cultures, they scowl about 35% of the time, which is more than chance but not enough to be a universal expression of anger. The other 65% of the time, they move their faces in other meaningful ways. They might pout or frown. They might cry. They might laugh. They might sit quietly and plot their enemy’s demise. Even when Westerners do scowl, half the time it isn’t in anger. They scowl when they concentrate, when they enjoy a bad pun or when they have gas.
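A quick back-of-envelope calculation shows how weak the signal is. The sketch below is not from the article; it simply plugs the two figures quoted above into a hypothetical detector that flags every scowl as anger, and the variable names and detector framing are illustrative assumptions rather than any real product's method:

# Hypothetical scowl-based "anger detector", using only the figures quoted above.
p_scowl_given_anger = 0.35   # angry Westerners scowl about 35% of the time
p_anger_given_scowl = 0.50   # about half of scowls are not in anger

miss_rate = 1 - p_scowl_given_anger          # angry moments with no scowl to flag
false_alarm_rate = 1 - p_anger_given_scowl   # scowl-triggered flags that aren't anger

print(f"Anger episodes the detector misses: {miss_rate:.0%}")        # ~65%
print(f"'Anger' flags that are not anger:   {false_alarm_rate:.0%}")  # ~50%

Under those assumptions, such a detector would miss roughly two-thirds of angry moments and be wrong about half the time when it did fire.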

Similar findings hold true for every so-called universal facial expression of emotion. Frowning in sadness, smiling in happiness, widening your eyes in fear, wrinkling your nose in disgust and yes, scowling in anger, are stereotypes—common but oversimplified notions about emotional expressions.

Where did these stereotypes come from? You may be surprised to learn that they were not discovered by observing how people move their faces during episodes of emotion in real life. They originated in a book by Charles Darwin, “The Expression of the Emotions in Man and Animals,” which proposed that humans evolved certain facial movements from ancient animals. But Darwin didn’t conduct careful observations for these ideas as he had for his masterwork, “On the Origin of Species.” Instead, he came up with them by studying photographs of people whose faces were stimulated with electricity, then asked his colleagues if they agreed.

In 2019, the journal Psychological Science in the Public Interest engaged five senior scientists, including me, to examine the scientific evidence for the idea that people express anger, sadness, fear, happiness, disgust and surprise in universal ways. We came from different fields—psychology, neuroscience, engineering and computer science—and began with opposing views. Yet, after reviewing more than a thousand papers during almost a hundred videoconferences, we reached a consensus: In the real world, an emotion like anger or sadness is a broad category full of variety. People express different emotions with the same facial movements and the same emotion with different facial movements. The variation is meaningfully tied to a person’s situation.

In short, we can’t train AI on stereotypes and expect the results to work in real life, no matter how big the data set or sophisticated the algorithm. Shortly after the paper was published, Microsoft retired the emotion AI features of their facial recognition software.

Other scientists have also demonstrated that faces are a poor indicator of a person’s emotional state. In a study published in the journal Psychological Science in 2008, scientists combined photographs of stereotypical but mismatched facial expressions and body poses, such as a scowling face attached to a body that’s holding a dirty diaper. Viewers asked to identify the emotion in each image typically chose what was implied by the body, not the face—in this case disgust, not anger. In a study published in the journal Science in 2012, the same lead scientist showed that winning and losing athletes, in the midst of their glory or defeat, make facial movements that are indistinguishable.

Nevertheless, these stereotypes are still widely assumed to be universal expressions of emotion. They’re in posters in U.S. preschools, spread through the media, designed into emojis and now enshrined in AI code. I recently asked two popular AI-based image generators, Midjourney and OpenAI’s DALL-E, to depict “an angry person.” I also asked two AI chatbots, OpenAI’s ChatGPT and Google’s Gemini, how to tell if a person is angry. The results were filled with scowls, furrowed brows, tense jaws and clenched teeth.

Even AI systems that appear to sidestep emotion stereotypes may still apply them in stealth. A 2021 study in the journal Nature trained an AI model with thousands of video clips from the internet and tested it on millions more. The authors concluded that 16 facial expressions are made worldwide in certain social contexts. Yet the trainers who labeled the clips with emotion words were all English speakers from a single country, India, so they effectively transmitted cultural stereotypes to a machine. Plus, there was no way to objectively confirm what the strangers in the videos were actually feeling at the time.

Clearly, large data sets alone cannot protect an AI system from applying preconceived assumptions about emotion. The European Union’s AI Act, passed in 2023, recognizes this reality by barring the use of emotion AI in policing, schools and workplaces.

So what is the path forward? If you encounter an emotion AI product that purports to hire skilled job candidates, diagnose anxiety and depression, assess guilt or innocence in court, detect terrorists in airports or analyze a person’s emotional state for any other purpose, it pays to be skeptical. Here are three questions you can ask about any emotion AI product to probe the scientific approach behind it.

Is the AI model trained to account for the huge variation of real-world emotional life? Any individual may express an emotion like anger differently at different times and in different situations, depending on context. People also use the same movements to express different states, even nonemotional ones. AI models must be trained to reflect this variety.

Does the AI model distinguish between observing facial movements and inferring meaning from these movements? Muscle movements are measurable; inferences are guesses. If a system or its designers confuse description with inference, like considering a scowl to be an “anger expression” or even calling a facial movement a “facial expression,” that’s a red flag.

Given that faces by themselves don’t reveal emotion, does the AI model include abundant context? I don’t mean just a couple of signals, such as a person’s voice and heart rate. In real life, when you perceive someone else as emotional, your brain combines signals from your eyes, ears, nose, mouth, skin, and the internal systems of your body and draws on a lifetime of experience. An AI model would need much more of this information to make reasonable guesses about a person’s emotional state.
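To make the second question concrete, here is a minimal sketch of what keeping description separate from inference might look like in code. The class names, fields and the use of Facial Action Coding System-style "action units" are illustrative assumptions, not taken from the article or from any real emotion AI product:

from dataclasses import dataclass

@dataclass
class FacialObservation:
    """Measurable description: which facial muscles moved, and how strongly."""
    action_units: dict[str, float]   # e.g. {"AU4_brow_lowerer": 0.8}

@dataclass
class EmotionInference:
    """A guess about meaning; it needs context beyond the face and can be wrong."""
    label: str                 # e.g. "anger" -- an interpretation, not a measurement
    confidence: float
    context_used: list[str]    # what besides the face informed the guess

# A scowl is an observation; any emotion word is an inference layered on top of it.
scowl = FacialObservation(action_units={"AU4_brow_lowerer": 0.8})
guess = EmotionInference(label="concentration", confidence=0.4,
                         context_used=["job interview", "difficult question"])
# A system that jumps straight from the scowl to the label "anger", with no
# separate inference step and no context, has collapsed the two layers --
# the red flag the second question is probing for.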

AI promises to simplify decisions by providing quick answers, but these answers are helpful and justified only if they draw from the true richness and variety of experience. None of us wants important outcomes in our lives, or the lives of our loved ones, to be determined by a stereotype.
