Date: 2022-01-11

Last year, a series of videos surfaced of a simulated Tom Cruise that took social media by storm.

They were deepfakes – a form of digital fabrication powered by artificial intelligence, underpinned by ‘deep learning’ algorithms that learn the movements or sounds of two different recordings and combine them to produce realistic-looking fake media.

There are two kinds of deepfakes: video deepfakes, which reproduce the look and voice of an actual person, and audio deepfakes, which imitate a person’s voice. While deepfake detection software has received a lot of attention, it has mainly focused on analysing image files.

Now, researchers have developed a deepfake audio detection method designed to spot increasingly realistic audio deepfakes.

To do so, Joel Frank and Lea Schonherr, from the Horst Gortz Institute for IT Security at Ruhr-Universitat Bochum, amassed around 118,000 synthesised voice samples, amounting to almost 196 hours of fake audio in both English and Japanese.

“Such a dataset for audio deepfakes did not exist before,” explained Schonherr in a press release announcing the new method. “But in order to improve the methods for detecting fake audio files, you need all this material.”

To ensure the dataset was diverse, the team used six different AI algorithms when generating the audio snippets. The researchers then plotted the frequency distribution of each artificial audio file as a spectrogram and compared it with spectrograms of real speech recordings, and patterns began to emerge.
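
For readers curious about what a spectrogram comparison involves, here is a minimal sketch in Python. It is not the researchers' code: the two signals are synthetic tones standing in for a real and a fake recording, the function names (`spectrogram`, `mean_spectrum`, `tone`) are hypothetical, and the frame size and sample rate are arbitrary assumptions chosen to keep the example small.

```python
import cmath
import math


def spectrogram(signal, frame_size=64, hop=32):
    """Split the signal into overlapping frames and return, for each frame,
    the magnitude in every non-negative frequency bin."""
    # A Hann window softens the frame edges and reduces spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        mags = []
        # Naive discrete Fourier transform: slow, but dependency-free.
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def mean_spectrum(spec):
    """Average the magnitude of each frequency bin over all frames."""
    return [sum(frame[k] for frame in spec) / len(spec)
            for k in range(len(spec[0]))]


def tone(freq_hz, sample_rate=8000, n_samples=512):
    """Generate a pure sine tone (a stand-in for recorded audio)."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


SAMPLE_RATE = 8000
FRAME_SIZE = 64

# Assumption for illustration: the "fake" signal carries extra
# high-frequency content that the "real" one lacks.
real = tone(500)
fake = [a + 0.5 * b for a, b in zip(tone(500), tone(3500))]

real_avg = mean_spectrum(spectrogram(real, FRAME_SIZE))
fake_avg = mean_spectrum(spectrogram(fake, FRAME_SIZE))

# Find the frequency bin where the two average spectra differ most.
diff = [f - r for f, r in zip(fake_avg, real_avg)]
peak_bin = max(range(len(diff)), key=lambda k: diff[k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(f"Largest difference near {peak_hz} Hz")
```

In this toy setup the largest gap between the two average spectra sits at the frequency of the added tone. Real recordings would be loaded from audio files and analysed with an optimised FFT library rather than the hand-written transform used here.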