This spring, Clive Kabatznik, a Florida investor, called his local Bank of America representative to discuss a large money transfer he was planning to make. He then called again. Except the second call wasn’t from Kabatznik. Instead, a computer program had artificially generated his voice and attempted to deceive the bank employee into transferring the money to another destination.
Kabatznik and his banker were the targets of a next-generation scam attempt that has caught the attention of cybersecurity experts: the use of artificial intelligence to generate deepfake voice impersonations of real people.
The problem is still so new that there is no comprehensive count of how often it occurs. But Pindrop, a company that monitors audio traffic for many of the largest US banks, said it had seen a rise in both the prevalence and the sophistication of scammers' voice-fraud attempts this year. Late last year, Nuance, another major voice-authentication vendor, saw its first successful deepfake attack on a financial services client.
In Kabatznik's case, the fraud was detectable. But the speed of technological development, the falling cost of generative artificial intelligence programs, and the wide availability of recordings of people's voices on the internet have created the perfect conditions for voice-based AI-assisted scams.
Stolen customer data, such as bank account details available on underground markets, helps fraudsters carry out these attacks. It is even easier with wealthy clients, whose public appearances, including speeches, are often widely available online. Finding audio samples of everyday customers can be as simple as running an online search, for example on social media apps like TikTok and Instagram, for the name of someone whose banking information the scammers already have.
“There’s a lot of audio content out there,” said Vijay Balasubramaniyan, CEO and founder of Pindrop, which reviews voice verification systems for eight of the top 10 US financial institutions.
Over the past decade, Pindrop has reviewed recordings of more than 5 billion calls that have come into the call centers of the financial companies it serves. Those centers handle products like bank accounts, credit cards, and other services offered by major retail banks. Every call center receives calls from scammers, typically between 1,000 and 10,000 a year; 20 scam calls in a single week is common, according to Balasubramaniyan.
So far, computer-generated fake voices account for only a "handful" of these calls, according to Balasubramaniyan, and they began appearing just last year.
Most of the fake voice attacks Pindrop has seen have taken place in credit card call centers, where human representatives assist customers needing help with their cards.
Balasubramaniyan played an anonymized recording of one of these calls, which took place in March. Although it is a very rudimentary example, with a robotic voice that sounds more like an e-book reader than a person, the call illustrates how such scams could play out as AI makes it easier to imitate human voices.
The call begins with a bank employee greeting the customer. Then the automated-sounding voice says, “My card has been declined.”
“Can I ask who I’m speaking with?” responds the bank employee.
“My card has been declined,” the voice repeats.
The bank employee asks for the customer’s name again. A moment of silence is followed by faint typing sounds. According to Balasubramaniyan, the number of keystrokes corresponds to the number of letters in the customer’s name. The scammer is typing words into a program that will read them aloud.
In this case, the synthetic speech of the caller prompted the employee to transfer the call to another department and label it as potentially fraudulent, explained Balasubramaniyan.
Calls like this, using text-to-speech technology, are some of the easiest attacks to combat: call centers can use detection software to identify technical signs that the speech has been generated by a machine.
"Synthetic speech leaves traces behind, and many anti-spoofing algorithms detect them," explained Peter Soufleris, CEO of IngenID, a voice biometrics technology provider.
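To make the idea concrete, here is a minimal, hypothetical sketch of the kind of check such detection software might run. It is an illustration under my own assumptions, not Pindrop's or IngenID's actual method: real anti-spoofing systems rely on trained classifiers over much richer acoustic features and large labeled datasets, and the helper names and threshold below are invented for this example.

```python
# Illustrative sketch only (assumed approach, not any vendor's real system):
# anti-spoofing tools key off acoustic traces left by machine-generated speech.
# This toy check computes spectral flatness with librosa and flags audio whose
# flatness varies suspiciously little from frame to frame. The threshold is
# made up for illustration; production detectors use trained classifiers.
import numpy as np
import librosa


def flatness_stats(path: str) -> tuple[float, float]:
    """Return the mean and standard deviation of per-frame spectral flatness."""
    audio, _ = librosa.load(path, sr=16000)
    flatness = librosa.feature.spectral_flatness(y=audio)[0]
    return float(np.mean(flatness)), float(np.std(flatness))


def looks_synthetic(path: str, std_threshold: float = 0.01) -> bool:
    """Hypothetical heuristic: unnaturally uniform flatness across frames is
    one weak hint that the audio may be machine-generated text-to-speech."""
    _, std = flatness_stats(path)
    return std < std_threshold


if __name__ == "__main__":
    # "incoming_call.wav" is a hypothetical recording used for illustration.
    print(looks_synthetic("incoming_call.wav"))
```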
However, as with many security measures, it's an arms race between attackers and defenders, and one that has recently shifted. A scammer can now simply speak into a microphone or type a message and have it quickly converted into the target's voice.
Balasubramaniyan pointed out that Microsoft's generative AI system VALL-E could create a voice imitation that says whatever a user wants, based on an audio sample of just three seconds.
In a May episode of 60 Minutes, security consultant Rachel Tobac used software to clone the voice of Sharyn Alfonsi, one of the program’s correspondents, so convincingly that she managed to trick a 60 Minutes employee into giving her Alfonsi’s passport number.
The attack took only five minutes to execute, said Tobac, CEO of SocialProof Security. The tool she used has been available for purchase since January.
Brett Beranek, General Manager of Security and Biometrics at Nuance, a voice technology provider that Microsoft acquired in 2021, said that while terrifying deepfake demonstrations are common in security conferences, actual attacks are still very rare. The only successful attack against a Nuance client, in October, took the attacker more than a dozen attempts.
Beranek’s biggest concern isn’t attacks on call centers or automated systems, like the voice biometric systems many banks have deployed. He worries about scams where the caller goes directly to an individual.
“I had a conversation earlier this week with one of our customers,” he said. “They said, ‘Hey, Brett, it’s fantastic that we’ve got our contact center secured, but what happens if someone calls our CEO directly on their cell phone and impersonates someone else?'”
That’s what happened in Kabatznik’s case. According to the banker’s description, it seemed like the scammer was trying to convince her to transfer the money to a new destination, but the voice was repetitive, spoke over her, and used confusing phrases. The banker hung up.
“It was like talking to her, but it didn’t make sense,” Kabatznik said the banker had told him. (A Bank of America spokesperson declined to make the employee available for an interview).
Kabatznik said that after she received two more similar calls in quick succession, the banker alerted Bank of America's security team. Worried about the security of Kabatznik's account, she stopped responding to his calls and emails, even those coming from the real Kabatznik. The two reconnected only about ten days later, when Kabatznik arranged a visit to the bank's office.
“We continuously train our team to identify and recognize scams and help our clients avoid them,” said William Halldin, a spokesperson for Bank of America, who added he couldn’t comment on specific customers or their experiences.
Although the attacks are becoming more sophisticated, they stem from a basic cybersecurity threat that has existed for decades: a data breach that exposes personal information of bank customers. Between 2020 and 2022, personal data of over 300 million people fell into the hands of hackers, resulting in losses of $8.8 billion, according to the Federal Trade Commission.
Once they have gathered a batch of numbers, hackers sift through the information and match it to real people. Those who steal the information are almost never the same people who end up with it; instead, the thieves put it up for sale. Specialists can use any of…