Mutual Suppression Network for Video Prediction using Disentangled Features

Jungbeom Lee; Jangho Lee; Sungmin Lee; Sungroh Yoon

Abstract

Video prediction has been considered a difficult problem because video contains not only high-dimensional spatial information but also complex temporal information. Video prediction can be performed by finding features in recent frames and using them to generate approximations to upcoming frames. We approach this problem by disentangling spatial and temporal features in videos. We introduce a mutual suppression network (MSnet), which is trained in an adversarial manner and produces spatial features that are free of motion information and motion features that contain no spatial information. MSnet then uses a motion-guided connection within an encoder-decoder-based architecture to transform spatial features from a previous frame to the time of an upcoming frame. We show how MSnet can be used for video prediction using disentangled representations. We also carry out experiments to assess how effectively our method disentangles features. MSnet obtains better results than other recent video prediction methods even though it has simpler encoders.

Benchmarks

Benchmark: video-prediction-on-kth
Methodology: MSNET
Metrics: Cond: 10; Pred: 20; PSNR: 27.08; SSIM: 0.876
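
For readers unfamiliar with the PSNR figure reported above, the following is a minimal sketch of the standard peak signal-to-noise ratio definition, 10 * log10(MAX^2 / MSE), applied to one predicted frame and its ground truth. The function name `psnr`, the flat-sequence input format, and the default peak value of 255 are assumptions made for illustration; the page does not state how the authors preprocess frames or average the score over the 20 predicted frames.

```python
import math


def psnr(predicted, target, max_value=255.0):
    """Return PSNR in dB between two equal-length sequences of pixel values.

    Hypothetical helper for illustration only: frames are passed as flat
    sequences, and max_value=255 assumes 8-bit pixel intensities.
    """
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the predicted frame is closer to the ground truth. A per-video score could be obtained by averaging `psnr` over the predicted frames, though whether the reported number was computed that way is not stated here.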
