ADA-VAD: Unpaired Adversarial Domain Adaptation for Noise-Robust Voice Activity Detection

Jong Hwan Ko, Jiho Chang, Taesoo Kim

Abstract

Voice activity detection (VAD) is becoming an essential front-end component in various speech processing systems. Because these systems are commonly deployed in environments with diverse noise types and low signal-to-noise ratios (SNRs), an effective VAD method must robustly detect speech regions in noisy background signals. In this paper, we propose adversarial domain adaptive VAD (ADA-VAD), a deep neural network (DNN) based VAD method that is highly robust to audio samples with various noise types and low SNRs. The proposed method trains DNN models for the VAD task in a supervised manner. Simultaneously, to mitigate the performance degradation caused by background noise, it adopts adversarial domain adaptation to reduce the domain discrepancy between noisy and clean audio streams in an unsupervised manner. The results show that ADA-VAD achieves an average AUC that is 3.6 percentage points (%p) and 7 %p higher than that of models trained with manually extracted features on the AVA-Speech dataset and on a speech database synthesized with an unseen noise database, respectively.

Benchmarks

Benchmark | Methodology | Metrics
activity-detection-on-ava-speech | ADA-VAD | ROC-AUC: 79.1
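
For readers unfamiliar with the metric, the following is a minimal sketch of how ROC-AUC can be computed from frame-level VAD scores, and of what a gap in percentage points (%p) means. It is not the authors' evaluation code. The function names `roc_auc` and `percentage_point_gap` and all input values are hypothetical, chosen only for illustration. The sketch uses the standard rank-based (Mann-Whitney) formulation, in which AUC is the probability that a randomly chosen speech frame receives a higher score than a randomly chosen non-speech frame.

```python
def roc_auc(labels, scores):
    """Rank-based ROC-AUC for binary labels (1 = speech, 0 = non-speech).

    Tied scores receive the average of their ranks, so a tie between a
    speech frame and a non-speech frame counts as half a correct ordering.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one frame of each class")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of frames [i, j) that share the same score.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # Mann-Whitney U statistic for the positive class, normalised to [0, 1].
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def percentage_point_gap(auc_a, auc_b):
    """Difference between two AUC values (each in [0, 1]) in %p."""
    return (auc_a - auc_b) * 100.0


# Hypothetical frame labels and scores, for illustration only.
frame_labels = [0, 0, 1, 1]
frame_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(frame_labels, frame_scores)
```

The benchmark entry above appears to report the same quantity on a 0 to 100 scale, so an AUC of 0.791 would be listed as 79.1. A gap in %p is an absolute difference on that scale, not a relative improvement.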
