
Unsupervised Video Summarization via Attention-Driven Adversarial Learning

Ioannis Patras, Vasileios Mezaris, Alexandros I. Metsai, Eleni Adamantidou, Evlampios Apostolidis

Abstract

This paper presents a new video summarization approach that integrates an attention mechanism to identify the significant parts of the video, and is trained in an unsupervised manner via generative adversarial learning. Starting from the SUM-GAN model, we first develop an improved version of it (called SUM-GAN-sl) that has a significantly reduced number of learned parameters, performs incremental training of the model’s components, and applies a stepwise label-based strategy for updating the adversarial part. Subsequently, we introduce an attention mechanism to SUM-GAN-sl in two ways: (i) by integrating an attention layer within the variational auto-encoder (VAE) of the architecture (SUM-GAN-VAAE), and (ii) by replacing the VAE with a deterministic attention auto-encoder (SUM-GAN-AAE). Experimental evaluation on two datasets (SumMe and TVSum) documents the contribution of the attention auto-encoder to faster and more stable training of the model, resulting in a significant performance improvement with respect to the original model and demonstrating the competitiveness of the proposed SUM-GAN-AAE against the state of the art.
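To make the idea of an attention layer over video frames more concrete, the following is a minimal sketch of generic scaled dot-product attention for a single query frame. It is not the formulation used in SUM-GAN-AAE, which the abstract does not specify; the function names `softmax` and `attention_context` and the toy two-dimensional frame features are assumptions made purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(frames, query_index):
    """Generic scaled dot-product attention for one query frame.

    `frames` is a list of equal-length feature vectors (hypothetical
    frame features). Returns the attention weights over all frames
    and the weighted-sum context vector. Illustrative only; not the
    paper's attention layer.
    """
    query = frames[query_index]
    dim = len(query)
    # Similarity of the query frame to every frame, scaled by sqrt(dim).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in frames
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of all frame features.
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return weights, context


# Toy example: frames 0 and 2 are identical, frame 1 is orthogonal.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
weights, context = attention_context(frames, query_index=0)
print(weights, context)
```

In this toy input, the query frame attends equally to itself and to its duplicate, and less to the dissimilar frame. This is the kind of behaviour that lets an attention mechanism highlight related parts of a video.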

Benchmarks

Benchmark	Methodology	F1-score	Parameters (M)	Training time (s)
unsupervised-video-summarization-on-summe	SUM-GAN-AAE	48.9	24.31	1639
unsupervised-video-summarization-on-tvsum	SUM-GAN-AAE	58.3	24.31	5423
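The F1-scores above appear to be expressed as percentages. As a rough illustration of how such a score can be computed, the sketch below measures frame-level overlap between a machine-generated summary and one reference summary, each given as a binary selection per frame. The function name `f_score` and the toy inputs are assumptions. The exact evaluation protocol, such as how several user summaries per video are combined, is not stated here.

```python
def f_score(predicted, reference):
    """Frame-level F-score (in percent) between two binary summaries.

    `predicted` and `reference` are equal-length sequences in which a
    truthy value marks a frame as selected. Hypothetical helper for
    illustration; not the benchmark's official evaluation code.
    """
    if len(predicted) != len(reference):
        raise ValueError("summaries must cover the same number of frames")

    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    if overlap == 0:
        return 0.0

    precision = overlap / sum(1 for p in predicted if p)
    recall = overlap / sum(1 for r in reference if r)
    return 100.0 * 2.0 * precision * recall / (precision + recall)


# Toy example: one of two selected frames matches the reference.
print(f_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

Precision penalizes selecting frames the reference omits, and recall penalizes missing frames the reference includes. The harmonic mean therefore rewards summaries that balance both.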
