Voice Deepfakes: The Next Frontier in Cybersecurity

Amid the fervor surrounding new applications of artificial intelligence lie the very real implications of what that technology can do. What was once science fiction is quickly becoming reality, and the cybersecurity implications of technology able to imitate the faces, and now voices, of anyone with sufficient data to draw from are vast.

Deepfakes represent a rapid acceleration of a century-old concern: whether we can truly believe what we see and hear when we are not there to see and hear it ourselves. Photo manipulation, fake audio recordings, and doctored video have been around since those technologies first arrived, but only now has AI developed to the point that it can construct fake images and voices of real people from the right reference material. Even more concerning, these deepfakes are becoming more difficult to spot.

From a wave of deepfake videos featuring MrBeast, Tom Hanks, Oprah, and Elon Musk last fall to the recent surge in deepfake pornography depicting famous people like Taylor Swift, it's becoming increasingly possible to make fake videos of real people with AI. While these videos are not perfect and can be quickly debunked by the people featured in them, the growing availability of the technology and the use of voice deepfakes to streamline production are making it harder to keep up.

The Complicated Nature of Voice Deepfakes

Dozens of deepfake tools are freely available on GitHub, and they are spreading rapidly. Low-cost services will produce deepfake videos on commission, and voice cloning can be set up in minutes. Moreover, many companies lack the technology to detect fake videos and audio clips, a shortcoming coming into sharp relief as we enter an election year.

In one of the most prominent examples of voice deepfakes so far, a series of robocalls imitating the voice of President Joe Biden were made in New Hampshire. The calls instructed voters not to cast their ballots, directly interfering with the presidential primary. 

While many headlines focus on video deepfakes, which look and feel more insidious, voice deepfakes are easier to produce, require less input data, and can be deployed in ways that are intrinsically more disruptive. Bots and support services offering voice deepfakes start at just $10 per month and have reached a level of sophistication that is directly impacting consumers, voters, and businesses.

While the highest quality voice cloning is not an overnight process, requiring extensive training and calibration to match an existing voice, many tools are streamlining the process, and in some applications, a “close enough” clone is enough to fool victims. As the computational power needed to clone a voice is reduced and the algorithms become more sophisticated, it becomes easier to clone voices to a higher fidelity in less time and with less data.

The Security Implications of Deepfakes

Voice deepfakes pose a number of security challenges, first and foremost among them identity theft. While many institutional systems use personally identifiable information to gate access to things like financial accounts, thieves have long looked for ways around such safeguards. Hacking a password isn't necessary if you can get the account owner to freely give away the answers to security questions. Social engineering and phishing campaigns account for 22% of data breaches, and 91% of all cyberattacks start with a phishing email. A recent NYT profile showed how scammers are using voice deepfakes to target banks, specifically credit card call centers. The author cites the infamous 60 Minutes segment from 2023 in which a consultant cloned the anchor's voice and got an employee to provide a passport number in just a few minutes.

The high success rate of social engineering attacks has led many companies to offer training on how to handle emails requesting personal information. But it becomes much harder when those emails turn into phone calls or voicemails in the voice of someone familiar: a boss, coworker, or family member who forgot their password and just needs to get in real quick.

Beyond identity theft, there is the risk posed by false advertising. Deepfakes are already being used to emulate celebrities like Elon Musk, Joe Rogan, Oprah Winfrey, and many others. While the FTC has clear guidelines about the legality of false endorsements and recently announced new AI impersonation guidelines, scammers are accelerating their use of deepfake technology. Hearing a famous voice on a podcast, in a robocall, or in the background of a video can have a profound impact on building trust in what might otherwise be an obvious scam, such as the growing number of deepfake background voices used on YouTube to add credibility to questionable products.

Finally, there is the socio-political impact of voice deepfakes. More than a simple security risk, these deepfakes threaten democratic processes and the integrity of elections. From Ron DeSantis' campaign using deepfake images of Donald Trump last summer to the AI voice clone of Joe Biden used in New Hampshire in January, Pandora's box is open. As a result, the FEC is considering policies to limit or explicitly block the use of generative AI, such as deepfakes, in political ads. Additionally, 32 states have put forth bills to regulate deepfakes in politics, and Meta and Google have both updated their AI deepfake policies for political ads, but none of this will stop bad actors from using the technology as long as it remains difficult to detect.

Detecting and Responding to Voice Deepfakes

In the United States, there are no federal laws regulating the use of deepfake technology. Despite proposed bills such as the No Artificial Intelligence Fake Replicas And Unauthorized Duplications (No AI FRAUD) Act, which would make it illegal to create digital depictions of a person (alive or dead) without consent, there has been little movement on these issues. Eleven states have passed some form of legislation related to deepfakes, with California issuing a ban on deepfakes in election season and allowing victims of deepfakes to sue. 

While most of the world is still just responding to the impact of deepfake technology, the EU recently passed the AI Act, which establishes several requirements for the use of the technology. While more legislation is likely to follow, much of it focuses on the impact of deepfakes on the victims of the technology, because it is inherently difficult to detect and mitigate deepfakes proactively. The law essentially accepts that deepfakes will occur and looks at how to protect those who are impacted.

To truly respond to deepfakes, we need AI technology that proactively tracks, detects, and mitigates the threat. Behavioral Signals has developed patent-protected technology that analyzes key signatures in the human voice to enable deepfake voice detection. The advanced AI detection technology built into Behavioral Signals supports more robust voice authentication, detecting deviations from natural speaking voices and signal characteristics that might indicate the use of generative AI. Systems like this will become integral to ensuring greater security and protecting against voice-based fraud at scale.
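To illustrate the general idea only (this is not Behavioral Signals' proprietary method), the sketch below shows a naive version of the pattern: summarize a speaker's known-genuine recordings with simple acoustic statistics (MFCCs via the open-source librosa library) and score how far an incoming clip deviates from that profile. The file names, functions, and threshold are hypothetical, and a real detector would rely on far richer features and trained models.

```python
# Illustrative sketch only: a toy deviation score for voice clips, not a
# production deepfake detector and not Behavioral Signals' actual method.
# Assumes the open-source librosa and numpy packages; file paths are hypothetical.
import numpy as np
import librosa

def voice_profile(path, sr=16000, n_mfcc=20):
    """Summarize a recording as per-coefficient MFCC means and variances."""
    audio, _ = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.var(axis=1)])

def deviation_score(candidate_path, genuine_paths):
    """Distance (in standard deviations) of a candidate clip from a profile
    built from known-genuine recordings of the same speaker."""
    profiles = np.stack([voice_profile(p) for p in genuine_paths])
    mean, std = profiles.mean(axis=0), profiles.std(axis=0) + 1e-8
    z = (voice_profile(candidate_path) - mean) / std
    return float(np.sqrt(np.mean(z ** 2)))  # higher = less like the genuine voice

# Hypothetical usage: flag clips whose score exceeds a threshold tuned on
# held-out genuine recordings of the speaker.
# score = deviation_score("incoming_call.wav", ["ref_1.wav", "ref_2.wav", "ref_3.wav"])
# suspicious = score > 3.0
```

Real detection systems examine far subtler cues, such as prosody, breathing, and the spectral artifacts left by speech synthesizers, and are trained on large labeled corpora; the point here is only the underlying pattern of comparing an incoming voice against a verified baseline.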

Preparing for What Comes Next

Voice deepfake technology will only become more advanced, less expensive, and easier to use. Legislation to protect its victims is crucial, but so too is the technology needed to ensure these tools can be used safely and effectively. As with the advent of the commercial Internet, safeguards and protections are needed to maximize the benefits for users, reduce the risk of fraud and direct harm, and allow developing technologies to be used in ways that benefit society.
