RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution

Zhicheng Geng; Luming Liang; Tianyu Ding; Ilya Zharkov

Abstract

Space-time video super-resolution (STVSR) is the task of interpolating videos with both Low Frame Rate (LFR) and Low Resolution (LR) to produce High-Frame-Rate (HFR) and High-Resolution (HR) counterparts. Existing methods based on Convolutional Neural Networks (CNNs) achieve visually satisfying results but suffer from slow inference due to their heavy architectures. We propose to resolve this issue with a spatial-temporal transformer that naturally incorporates the spatial and temporal super-resolution modules into a single model. Unlike CNN-based methods, we do not use separate building blocks for temporal interpolation and spatial super-resolution; instead, we use a single end-to-end transformer architecture. Specifically, the encoders build a reusable dictionary from the input LFR and LR frames, which the decoder then uses to synthesize the HFR and HR frames. Compared with the state-of-the-art TMNet \cite{xu2021temporal}, our network is 60% smaller (4.5M vs 12.3M parameters) and 80% faster (26.2 fps vs 14.3 fps on 720×576 frames) without sacrificing much performance. The source code is available at https://github.com/llmpass/RSTT.
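The size and speed comparison with TMNet can be checked with a few lines of arithmetic. The sketch below uses only the four figures quoted in the abstract; the function and constant names are illustrative and do not come from the RSTT code base. It shows that the rounded "60% smaller" and "80% faster" correspond to roughly 63% fewer parameters and roughly 83% higher throughput.

```python
def relative_reduction(ours: float, baseline: float) -> float:
    """Fraction by which `ours` is smaller than `baseline`."""
    return 1.0 - ours / baseline


def relative_speedup(ours_fps: float, baseline_fps: float) -> float:
    """Fraction by which `ours_fps` exceeds `baseline_fps`."""
    return ours_fps / baseline_fps - 1.0


# Figures quoted in the abstract (parameters in millions, speed in fps).
RSTT_PARAMS_M, TMNET_PARAMS_M = 4.5, 12.3
RSTT_FPS, TMNET_FPS = 26.2, 14.3

size_reduction = relative_reduction(RSTT_PARAMS_M, TMNET_PARAMS_M)
speedup = relative_speedup(RSTT_FPS, TMNET_FPS)

print(f"parameter reduction: {size_reduction:.1%}")  # 63.4%
print(f"throughput gain:     {speedup:.1%}")         # 83.2%
```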

Code Repositories

llmpass/RSTT (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| space-time-video-super-resolution-on-vimeo90k | RSTT-M | PSNR: 36.78; SSIM: 0.9401 |
| space-time-video-super-resolution-on-vimeo90k | RSTT-L | PSNR: 36.80; SSIM: 0.9403 |
| space-time-video-super-resolution-on-vimeo90k | RSTT-S | PSNR: 36.58; SSIM: 0.9381 |
| space-time-video-super-resolution-on-vimeo90k-1 | RSTT-M | PSNR: 35.62; SSIM: 0.9377 |
| space-time-video-super-resolution-on-vimeo90k-1 | RSTT-S | PSNR: 35.43; SSIM: 0.9358 |
| space-time-video-super-resolution-on-vimeo90k-1 | RSTT-L | PSNR: 35.66; SSIM: 0.9381 |
| video-frame-interpolation-on-vid4-4x | RSTT-S | PSNR: 26.29; SSIM: 0.7941; Parameters: 4490000 |
| video-frame-interpolation-on-vid4-4x | RSTT-L | PSNR: 26.43; SSIM: 0.7994; Parameters: 7670000 |
| video-frame-interpolation-on-vid4-4x | RSTT-M | PSNR: 26.37; SSIM: 0.7978; Parameters: 6080000 |
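For readers unfamiliar with the PSNR column, the following is a minimal sketch of the standard peak signal-to-noise ratio formula, 10 · log10(MAX² / MSE), applied to flat lists of pixel values. It is an illustration only: the `psnr` helper and its `max_value` default of 255 are assumptions, and this page does not state the exact evaluation protocol (for example, which color channel is used) behind the numbers above.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    `max_value` is the largest possible pixel value (255 for 8-bit images;
    an assumption here, not something stated by the benchmark page).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 10 gray levels gives an MSE of 100.
example = psnr([100, 100, 100, 100], [110, 110, 110, 110])
print(f"{example:.2f} dB")  # 28.13 dB
```

Higher PSNR means the reconstructed frames are closer to the ground truth. Because the scale is logarithmic, the gaps of a few hundredths to a few tenths of a dB between the RSTT-S, RSTT-M and RSTT-L rows correspond to small differences in mean squared error.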
