5 months ago

Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion

Bird Jordan J. ; Lotfi Ahmad

Abstract

There are growing implications surrounding generative AI in the speech domain that enable voice cloning and real-time voice conversion from one individual to another. This technology poses a significant ethical threat and could lead to breaches of privacy and misrepresentation, thus there is an urgent need for real-time detection of AI-generated speech for DeepFake Voice Conversion. To address the above emerging issues, the DEEP-VOICE dataset is generated in this study, comprised of real human speech from eight well-known figures and their speech converted to one another using Retrieval-based Voice Conversion. Presenting as a binary classification problem of whether the speech is real or AI-generated, statistical analysis of temporal audio features through t-testing reveals that there are significantly different distributions. Hyperparameter optimisation is implemented for machine learning models to identify the source of speech. Following the training of 208 individual machine learning models over 10-fold cross validation, it is found that the Extreme Gradient Boosting model can achieve an average classification accuracy of 99.3% and can classify speech in real-time, at around 0.004 milliseconds given one second of speech. All data generated for this study is released publicly for future research on AI speech detection.
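
As a rough illustration of the t-testing step mentioned in the abstract, the sketch below compares one audio feature between a "real" and an "AI-generated" group. It is not the authors' code: the abstract does not say which t-test variant or which features were used, so Welch's unequal-variance t-test is an assumption here, and the function name `welch_t` and the feature values are hypothetical placeholders rather than DEEP-VOICE data.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, approximate degrees of freedom) for two samples.

    Assumption: Welch's unequal-variance t-test; the abstract does not
    name the variant used in the study.
    """
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Sample variance of each group divided by its size (squared standard error)
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical values of a single feature, one value per second of audio
real_feature = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16]
fake_feature = [0.21, 0.19, 0.22, 0.20, 0.23, 0.18]

t_stat, dof = welch_t(real_feature, fake_feature)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large absolute t statistic relative to the degrees of freedom suggests the feature is distributed differently in the two classes, which is the kind of evidence the abstract reports before moving on to classification.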

Benchmarks

Benchmark | Methodology | Metrics
audio-classification-on-deep-voice-deepfake | XGBoost (330) | Accuracy (10-fold): 99.3%
