Stochastic Adversarial Video Prediction

Alex X. Lee; Richard Zhang; Frederik Ebert; Pieter Abbeel; Chelsea Finn; Sergey Levine

Abstract

Being able to predict what may happen in the future requires an in-depth understanding of the physical and causal rules that govern the world. A model that is able to do so has a number of appealing applications, from robotic planning to representation learning. However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging: the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction. Recently, this has been addressed by two distinct approaches: (a) variational latent variable models that explicitly model underlying stochasticity and (b) adversarially-trained models that aim to produce naturalistic images. However, a standard latent variable model can struggle to produce realistic results, and a standard adversarially-trained model underutilizes latent variables and fails to produce diverse predictions. We show that these distinct methods are in fact complementary. Combining the two produces predictions that look more realistic to human raters and better cover the range of possible futures. Our method outperforms prior and concurrent work in these aspects.
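The blurring effect described in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a hypothetical three-pixel "frame" in which a bright pixel moves left or right with equal probability, and shows that the prediction minimizing expected mean squared error is the pixel-wise average of the two futures, which matches neither of them.

```python
# Toy illustration (an assumption for exposition, not the paper's model):
# a 3-pixel "frame" whose bright pixel ends up on the left or on the right
# with equal probability.

def mse(a, b):
    """Mean squared error between two equally sized frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

future_left = [1.0, 0.0, 0.0]
future_right = [0.0, 0.0, 1.0]
futures = [future_left, future_right]

def expected_mse(prediction):
    """Average MSE of a single deterministic prediction over all futures."""
    return sum(mse(prediction, f) for f in futures) / len(futures)

# Pixel-wise mean of the possible futures: a "blurry" frame.
blurry = [sum(pixels) / len(futures) for pixels in zip(*futures)]

loss_blurry = expected_mse(blurry)       # averaging both futures
loss_left = expected_mse(future_left)    # committing to one sharp future
loss_right = expected_mse(future_right)

print("blurry prediction:", blurry)
print("expected MSE (blurry):", loss_blurry)
print("expected MSE (sharp, left):", loss_left)
```

Under a pixel-wise squared-error loss, the blurry average scores better than either sharp future, so a deterministic model trained this way is pushed toward it. Modeling the stochasticity explicitly and training adversarially are the two remedies the abstract describes.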

Code Repositories

alexlee-gk/video_prediction (Official; TensorFlow; mentioned in GitHub)
MIT-Omnipush/video-prediction (TensorFlow; mentioned in GitHub)
Bonennult/video_prediction (TensorFlow; mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| video-generation-on-bair-robot-pushing | SAVP (from SRVP) | Cond: 2; FVD score: 152±9; LPIPS: 0.0634±0.0026; PSNR: 18.44±0.25; Pred: 28; SSIM: 0.7887±0.0092; Train: 12 |
| video-generation-on-bair-robot-pushing | SAVP (from vRNN) | Cond: 2; FVD score: 143.43; LPIPS: 0.062±0.03; Pred: 28; SSIM: 0.795±0.07; Train: 10 |
| video-generation-on-bair-robot-pushing | SAVP (from FVD) | Cond: 2; FVD score: 116.4; Pred: 14; Train: 14 |
| video-generation-on-bair-robot-pushing | SAVP-VAE (from WAM) | Cond: 2; PSNR: 19.09; Pred: 28; SSIM: 0.815; Train: 14 |
| video-prediction-on-kth | SAVP-VAE | Cond: 10; PSNR: 27.77; Pred: 20; SSIM: 0.852 |
| video-prediction-on-kth | SAVP-VAE (from Grid-keypoints) | Cond: 10; FVD: 145.7; LPIPS: 0.116; PSNR: 26.00; Params (M): 7.3; Pred: 40; SSIM: 0.806; Train: 10 |
| video-prediction-on-kth | SAVP (from Grid-keypoints) | Cond: 10; FVD: 183.7; LPIPS: 0.126; PSNR: 23.79; Params (M): 17.6; Pred: 40; SSIM: 0.699; Train: 10 |
| video-prediction-on-kth | SAVP (from SRVP) | Cond: 10; FVD: 374±3; LPIPS: 0.1120±0.0039; PSNR: 26.51±0.29; Pred: 30; SSIM: 0.7564±0.0062; Train: 10 |
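For readers unfamiliar with the PSNR column, the following is a minimal sketch of the standard peak signal-to-noise ratio formula, 10 * log10(MAX^2 / MSE). The function name `psnr`, the flat-list frame representation, and the default `max_value=1.0` are assumptions made for illustration. The benchmark entries above come from different papers with different Cond/Train/Pred settings, and this sketch does not reproduce any of their evaluation protocols.

```python
import math

def psnr(reference, prediction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are flat sequences of pixel intensities in [0, max_value].
    This is the textbook definition, not a specific benchmark's protocol.
    """
    if len(reference) != len(prediction) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical two-pixel frames: every pixel is off by 0.1, so MSE is 0.01.
example_score = psnr([0.0, 1.0], [0.1, 0.9])
print(round(example_score, 2))
```

Higher PSNR means lower pixel-wise error. Because it rewards pixel-wise averaging, PSNR can favor blurry predictions, which is one reason the table also lists perceptual metrics such as LPIPS and FVD.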
