Unsupervised Video Summarization via Attention-Driven Adversarial Learning
Ioannis Patras, Vasileios Mezaris, Alexandros I. Metsai, Eleni Adamantidou, Evlampios Apostolidis
Abstract
This paper presents a new video summarization approach that integrates an attention mechanism to identify the significant parts of the video, and is trained in an unsupervised manner via generative adversarial learning. Starting from the SUM-GAN model, we first develop an improved version of it (called SUM-GAN-sl) that has a significantly reduced number of learned parameters, performs incremental training of the model’s components, and applies a stepwise label-based strategy for updating the adversarial part. Subsequently, we introduce an attention mechanism to SUM-GAN-sl in two ways: (i) by integrating an attention layer within the variational auto-encoder (VAE) of the architecture (SUM-GAN-VAAE), and (ii) by replacing the VAE with a deterministic attention auto-encoder (SUM-GAN-AAE). Experimental evaluation on two datasets (SumMe and TVSum) documents the contribution of the attention auto-encoder to faster and more stable training of the model, resulting in a significant performance improvement with respect to the original model and demonstrating the competitiveness of the proposed SUM-GAN-AAE against the state of the art.
Benchmarks
| Benchmark | Methodology | F1-score | Parameters (M) | Training time (s) |
|---|---|---|---|---|
| unsupervised-video-summarization-on-summe | SUM-GAN-AAE | 48.9 | 24.31 | 1639 |
| unsupervised-video-summarization-on-tvsum | SUM-GAN-AAE | 58.3 | 24.31 | 5423 |
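The F1-scores above measure how closely a generated summary matches a reference summary. This page does not spell out the exact evaluation protocol, so the following is only a minimal sketch of the standard F-score, assuming both summaries are given as binary per-frame selections of equal length. The function name `f_score` is hypothetical, and how scores are aggregated over multiple user annotations or videos is not covered here.

```python
def f_score(predicted, reference):
    """Return the F-score (in percent) between two binary per-frame selections.

    Assumption: predicted[i] and reference[i] are truthy when frame i belongs
    to the machine summary and the reference summary, respectively.
    """
    if len(predicted) != len(reference):
        raise ValueError("both selections must cover the same number of frames")

    # Frames selected by both summaries.
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    n_predicted = sum(1 for p in predicted if p)
    n_reference = sum(1 for r in reference if r)

    # No shared frames (or an empty summary) gives a score of zero.
    if overlap == 0:
        return 0.0

    precision = overlap / n_predicted
    recall = overlap / n_reference
    return 100.0 * 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    machine = [1, 1, 0, 0, 1, 0]
    user = [1, 0, 0, 1, 1, 0]
    print(round(f_score(machine, user), 2))
```

Precision is the fraction of selected frames that also appear in the reference, and recall is the fraction of reference frames that were selected; the F-score is their harmonic mean, scaled to a percentage to match the table.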