Hsin-Min Wang, Yu Tsao, Chia Wen Lin, Wasim Ahmad, Sahibzada Adil Shahzad, Ammarah Hashmi

Abstract
The recent rapid progress in Artificial Intelligence (AI) technology has enabled the creation of hyper-realistic deepfakes, and detecting deepfake videos (also known as AI-synthesized videos) has become a critical task. Existing systems generally do not process audio and video data in a unified way, which leaves room for improvement. In this paper, we focus on the multimodal forgery detection task and propose a deep forgery detection method based on audiovisual ensemble learning. The proposed method consists of four parts: a Video Network, an Audio Network, an Audiovisual Network, and a Voting Module. Given a video, the proposed multimodal ensemble learning system identifies whether it is fake or real. Experimental results on the recently released multimodal FakeAVCeleb dataset show that the proposed method achieves 89% accuracy, significantly outperforming existing models.
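The abstract names a Voting Module that combines the three networks but does not state the voting rule. As a rough illustration only, the sketch below assumes simple majority voting over one "real"/"fake" label per network. The function name `majority_vote`, its parameters, and the label strings are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Assumed label set; the paper's actual label encoding is not given in the abstract.
VALID_LABELS = {"real", "fake"}


def majority_vote(video_pred: str, audio_pred: str, audiovisual_pred: str) -> str:
    """Return the label chosen by at least two of the three networks.

    Assumption: each network emits a hard "real"/"fake" decision and the
    Voting Module takes a simple majority. The paper may use a different rule.
    """
    votes = [video_pred, audio_pred, audiovisual_pred]
    for vote in votes:
        if vote not in VALID_LABELS:
            raise ValueError(f"unexpected label: {vote!r}")
    # With three binary votes, one label always receives at least two votes.
    label, _count = Counter(votes).most_common(1)[0]
    return label


if __name__ == "__main__":
    # Two of the three hypothetical network outputs say "fake".
    print(majority_vote("fake", "real", "fake"))
```

Because there are three voters and two labels, a tie cannot occur, so this rule needs no tie-breaking step. A soft-voting variant that averages per-network probabilities would be another plausible reading of the abstract.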
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| multimodal-forgery-detection-on-fakeavceleb | Ensemble AudioVisual Model | Accuracy (%): 89 |