Efficient and Information-Preserving Future Frame Prediction and Beyond

Yichao Lu, Wei Yu, Sanja Fidler, Steve Easterbrook

Abstract

Applying resolution-preserving blocks is a common practice to maximize information preservation in video prediction, yet their high memory consumption greatly limits their application scenarios. We propose CrevNet, a Conditionally Reversible Network that uses reversible architectures to build a bijective two-way autoencoder and its complementary recurrent predictor. Our model enjoys the theoretically guaranteed property of no information loss during feature extraction, as well as much lower memory consumption and greater computational efficiency. The lightweight nature of our model enables us to incorporate 3D convolutions without concern for memory bottlenecks, enhancing the model's ability to capture both short-term and long-term temporal dependencies. Our proposed approach achieves state-of-the-art results on the Moving MNIST, Traffic4cast, and KITTI datasets. We further demonstrate the transferability of our self-supervised learning method by exploiting its learnt features for object detection on KITTI. Our competitive results indicate the potential of using CrevNet as a generative pre-training strategy to guide downstream tasks.
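
To make the "no information loss" claim concrete, the sketch below shows a generic additive coupling step, the kind of building block commonly used in reversible architectures. It is not the authors' actual CrevNet block: the functions `f` and `g` are hypothetical stand-ins for learned sub-networks, and the inputs are plain lists of floats rather than feature maps. The point is only that the input can be recovered from the output even though `f` and `g` are not themselves invertible.

```python
import math
import random


def f(v):
    """Hypothetical stand-in for a learned sub-network (assumption, not from the paper)."""
    return [math.tanh(3.0 * a) for a in v]


def g(v):
    """Second hypothetical stand-in; it does not need to be invertible."""
    return [a * a for a in v]


def forward(x1, x2):
    """Additive coupling: each half is updated using only the other half."""
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2


def inverse(y1, y2):
    """Undo the two updates in reverse order by subtracting the same terms."""
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2


random.seed(0)
x1 = [random.uniform(-1, 1) for _ in range(4)]
x2 = [random.uniform(-1, 1) for _ in range(4)]

y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)

# Largest reconstruction error across both halves; it should be at the
# level of floating-point rounding, since the mapping is a bijection.
max_err = max(abs(a - b) for a, b in zip(x1 + x2, r1 + r2))
print(max_err < 1e-12)
```

Because the input of such a step can be recomputed from its output, intermediate activations in a stack of these steps do not have to be stored for backpropagation, which is the usual source of the memory savings attributed to reversible networks.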

Benchmarks

Benchmark | Methodology | MSE | SSIM
video-prediction-on-moving-mnist | CrevNet+ST-LSTM | 22.3 | 0.949
video-prediction-on-moving-mnist | CrevNet+ConvLSTM | 38.5 | 0.928
