Why Voice Reveals More About Your Emotions

26 Mar

Humans are emotionally complex, not just in how we feel, but in how we convey those feelings. What you say and what you do communicate only a fraction of your emotions. Thousands of other cues in body language, facial expressions, and especially voice can paint a vivid picture of how you feel at any particular moment.

This is an important area of emerging research, especially as AI systems are built to recognize human emotion and respond in kind. A big part of that is recognizing where and how humans can “hide” or skew their emotions. In particular, research suggests that voice is much more difficult to fake than facial expressions, and in general, both are difficult to fake for long periods of time. It takes a conscious, concerted effort to change your voice and physical expressions so that they don’t match how you actually feel. When it comes to voice, many of these nuances are noticeable in conversation, however subtle they might be.

The Ability of Humans to Capture Subtle Nuance in Speech

The same innocuous phrase can be expressed and perceived in dozens of different ways. A spouse asking, “did you remember the milk?” could be a simple question without any emotion – a tick on a box in a mental to-do list – or it could be judgmental, voicing disapproval at perceived incompetence. Or anything in between.

Humans can pick up on these small nuances in voice and, in many cases, intentionally encode them. The research around this is constantly developing. Until recently, facial expressions and body language were identified as the most important elements of non-linguistic communication – the nonverbal cues that could define intent in a conversation.

But a recent study by Michael Kraus (among others) identifies our sense of hearing as being more acute at detecting emotion in a conversation. His study showed that people identified emotions more accurately when they only heard a voice than when they only saw facial expressions, and also more accurately than when they could both hear the voice and see the face. When isolated, a voice is loaded with information that the human ear is particularly good at understanding.

Consider some of the simple cues that can convey a spectrum of emotion in a conversation. Quick breathing, clipped words, and excessive pauses might point to anxiety or being upset. A slow, monotone voice or a quieter tone than normal could point to exhaustion or illness. Faster, slightly louder speech could point to excitement.
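To make cues like pauses and loudness a little more concrete, here is a minimal Python sketch that measures two of them in a raw audio signal. It is an illustration only, not how any particular Emotion AI product works. The function names, the 20 ms frame size, and the silence threshold are all assumptions chosen for the example, and the "speech" is a synthetic tone followed by silence rather than a real recording.

```python
import math


def frame_rms(samples, frame_len):
    """Root-mean-square energy of each non-overlapping frame."""
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    return [math.sqrt(sum(x * x for x in f) / frame_len) for f in frames]


def prosodic_cues(samples, sample_rate, frame_ms=20, silence_threshold=0.02):
    """Estimate two rough cues: share of time spent pausing, and loudness.

    frame_ms and silence_threshold are illustrative defaults (assumptions),
    not calibrated values.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    rms = frame_rms(samples, frame_len)
    if not rms:
        raise ValueError("signal is shorter than one frame")
    # Frames at or above the threshold are treated as speech, the rest as pauses.
    voiced = [r for r in rms if r >= silence_threshold]
    pause_ratio = 1 - len(voiced) / len(rms)
    mean_loudness = sum(voiced) / len(voiced) if voiced else 0.0
    return {"pause_ratio": pause_ratio, "mean_loudness": mean_loudness}


def make_tone(seconds, sample_rate, amplitude, freq=220.0):
    """Synthetic stand-in for speech: a plain sine tone."""
    n = int(seconds * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq * t / sample_rate)
        for t in range(n)
    ]


# Half a second of "speech" followed by half a second of silence.
sample_rate = 8000
signal = make_tone(0.5, sample_rate, 0.5) + [0.0] * (sample_rate // 2)
cues = prosodic_cues(signal, sample_rate)
print(cues)
```

Numbers like these say nothing about emotion on their own. A high pause ratio might reflect anxiety, careful thought, or a bad phone line, which is why they are better treated as inputs to a larger model than as answers.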

It goes further than just dissecting and understanding someone’s base emotional state from their speech, though. As discussed in a recent article from the Berkeley Greater Good Science Center, research by Emiliana Simon-Thomas and Dacher Keltner showed that humans can capture small nuances in speech, distinguishing between sadness, anger, repulsion, and exhaustion, for example. And many of these cues are language independent. People are able to determine emotional state even when they are not fluent in the language being spoken.

Empathy in an Increasingly Digital World

Until recently, technology had allowed us to largely abandon face-to-face conversations in our daily lives. People spend an average of more than two hours per day on social media, use email far more than their phones, and have shifted even short, quick in-office conversations to communal platforms like Slack.

When so much of how we interact is tied up in our connection through speech, what impact does this digital transition have on our ability to feel empathy and truly connect with one another? Research on this is still developing, but based on what we are learning about the role of vocal cues in understanding emotions and connecting with one another on a very basic human level, a lot is getting lost in text messages and emails.

Emotions are driven by our speech and our actions, more so than we previously realized. A recent study by Jean-Julien Aucouturier at CNRS in France asked people to read and record a short, innocuous story. Their voices were then altered, and when the recordings were played back, many participants felt different based on what they heard. If their voice was sped up and the pitch raised, they felt more excited. If it was slowed down with pauses added, they felt a bit more tired or unsure of themselves. It’s an interesting experiment that highlights the deep emotional impact our voices have on us. So, when we don’t use our voices to communicate, it raises several questions about how effectively we are connecting with others.

What This Means for Voice Assistants and AI Technology

More than 1 billion smart devices now have some form of voice assistant – from dedicated voice-activated devices at home to smartphones and tablets with Siri or Google voice tools. These are the tip of the iceberg in how voice AI is being integrated into our lives, and a big part of how effective these tools will be is their ability to recognize emotion in the human voice.

The words in a conversation are the tip of an iceberg – only a small percentage of the conversation we are really having. That’s where Emotion AI can play an outsized role, bridging the gap between what can be perceived and what is just under the surface in a conversation. An AI system can be programmed to automatically gather and analyze information encoded in the human voice. And increasingly, this goes beyond objective behavioral content such as what someone said or what they were doing when they said it. Today’s Emotion AI can automate the detection of a host of subjective signals – detecting not just that someone is frustrated, but how frustrated they are on a spectrum drawn from millions of data points in conversations it has analyzed.

While humans detect these cues automatically, we are not always accurate in decoding their meaning. At the same time, we are notoriously bad at recognizing such signals in our own voices. With the use of Emotion AI, we can improve the performance of customer service and sales teams, build more responsive personal voice assistants, and leverage unstructured data in new and exciting ways across all industries. This is all possible with the advances in technology around voice AI.
