Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, Marc Pollefeys

Abstract
Objects moving at high speed appear significantly blurred when captured with cameras. The blurry appearance is especially ambiguous when the object has complex shape or texture. In such cases, classical methods, or even humans, are unable to recover the object's appearance and motion. We propose a method that, given a single image with its estimated background, outputs the object's appearance and position in a series of sub-frames as if captured by a high-speed camera (i.e. temporal super-resolution). The proposed generative model embeds an image of the blurred object into a latent space representation, disentangles the background, and renders the sharp appearance. Inspired by the image formation model, we design novel self-supervised loss function terms that boost performance and show good generalization capabilities. The proposed DeFMO method is trained on a complex synthetic dataset, yet it performs well on real-world data from several datasets. DeFMO outperforms the state of the art and generates high-quality temporal super-resolution frames.
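The abstract's reference to loss terms "inspired by the image formation model" points to the standard fast-moving-object formation model: the blurred observation is approximated by compositing each rendered sharp sub-frame over the estimated background with its alpha mask and averaging over the exposure time. The sketch below illustrates that compositing step only; the function name `composite_blurred`, the array shapes, and the uniform temporal averaging are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def composite_blurred(sub_frames, masks, background):
    """Approximate the observed blurred image from rendered sub-frames.

    sub_frames: (N, H, W, 3) sharp object appearances F_t in [0, 1]
    masks:      (N, H, W, 1) alpha masks M_t in [0, 1]
    background: (H, W, 3)    estimated static background B
    Returns:    (H, W, 3)    synthesized blurred image
    """
    # Per-sub-frame alpha compositing: M_t * F_t + (1 - M_t) * B
    composites = masks * sub_frames + (1.0 - masks) * background
    # Averaging over the N sub-frames models motion blur across the exposure
    return composites.mean(axis=0)

if __name__ == "__main__":
    # Hypothetical shapes and random data, purely for a shape check
    rng = np.random.default_rng(0)
    N, H, W = 8, 64, 64
    F = rng.random((N, H, W, 3))
    M = rng.random((N, H, W, 1))
    B = rng.random((H, W, 3))
    print(composite_blurred(F, M, B).shape)  # (64, 64, 3)
```

Comparing such a re-synthesized blurred image against the input provides a self-supervised reconstruction signal, which is one plausible reading of how the formation model enters the training objective.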
Benchmarks
| Benchmark | Methodology | PSNR | SSIM | TIoU |
|---|---|---|---|---|
| video-super-resolution-on-falling-objects | DeFMO | 26.83 | 0.753 | 0.684 |
| video-super-resolution-on-tbd | DeFMO | 25.57 | 0.602 | 0.550 |
| video-super-resolution-on-tbd-3d | DeFMO | 26.23 | 0.699 | 0.879 |