ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021
Shiguang Shan, Zhongqin Wu, Xiao Liu, Shuang Yang, Susan Liang, Yuanhang Zhang

Abstract
This report briefly describes our method for the AVA Active Speaker Detection (ASD) task at the ActivityNet Challenge 2021. Our solution, the Extended Unified Context Network (Extended UniCon), is based on the Unified Context Network (UniCon), a novel architecture designed for robust ASD that combines multiple types of contextual information to optimize all candidates jointly. We propose several changes to the original UniCon in terms of audio features, temporal modeling architecture, and loss function design. Our best model ensemble sets a new state of the art of 93.4% mAP on the AVA-ActiveSpeaker test set without any form of pretraining, and currently ranks first on the ActivityNet challenge leaderboard.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| audio-visual-active-speaker-detection-on-ava | Extended UniCon | validation mean average precision: 93.6% |
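
The results above are reported as mean average precision (mAP). As a rough illustration of what that metric measures, the sketch below computes non-interpolated average precision for a single class from ranked confidence scores. The function name `average_precision` and the toy scores are assumptions made for this example; this is not the official AVA-ActiveSpeaker evaluation script, which may differ in details such as interpolation and tie handling.

```python
def average_precision(scores, labels):
    """Non-interpolated average precision for one class.

    scores: predicted confidence that each face crop is speaking.
    labels: ground truth, 1 for speaking and 0 for not speaking.
    Illustrative only; not the official challenge evaluation code.
    """
    # Rank predictions from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    total_positives = sum(labels)
    if total_positives == 0:
        return 0.0

    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            # Precision measured at the rank of each true positive.
            precision_sum += hits / rank
    return precision_sum / total_positives


# Toy example (made-up values): two positives ranked 1st and 3rd.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
print(average_precision(toy_scores, toy_labels))
```

In this toy case the precisions at the two true positives are 1/1 and 2/3, so the result is their mean, 5/6. Averaging such per-class values over all classes gives mAP; with a single positive class the two coincide.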