

GLPanoDepth: Global-to-Local Panoramic Depth Estimation

Jiayang Bai, Shuichang Lai, Haoyu Qin, Jie Guo, Yanwen Guo


Abstract

In this paper, we propose a learning-based method for predicting dense depth values of a scene from a monocular omnidirectional image. An omnidirectional image has a full field of view, providing a much more complete description of the scene than perspective images. However, the fully convolutional networks that most current solutions rely on fail to capture rich global context from the panorama. To address this issue, as well as the distortion introduced by the equirectangular projection, we propose Cubemap Vision Transformers (CViT), a new transformer-based architecture that can model long-range dependencies and extract distortion-free global features from the panorama. We show that cubemap vision transformers have a global receptive field at every stage and can provide globally coherent predictions for spherical signals. To preserve important local features, we further design a convolution-based branch in our pipeline (dubbed GLPanoDepth) and fuse global features from cubemap vision transformers at multiple scales. This global-to-local strategy allows us to fully exploit useful global and local features in the panorama, achieving state-of-the-art performance in panoramic depth estimation.
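
To make the global-to-local idea above concrete, below is a minimal PyTorch sketch of a two-branch network that fuses transformer-based global features with convolutional local features before a depth head. All module names, layer sizes, and the use of a plain patch tokenizer (instead of the paper's cubemap projection) are illustrative assumptions, not the official GLPanoDepth implementation.

```python
# Minimal sketch of a global-to-local two-branch depth network.
# Module names and shapes are illustrative assumptions, not the
# official GLPanoDepth code.
import torch
import torch.nn as nn

class GlobalToLocalDepth(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # Local branch: plain convolutions over the equirectangular panorama.
        self.local_branch = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Global branch stand-in: a transformer encoder over patch tokens.
        self.patch_embed = nn.Conv2d(3, feat_dim, kernel_size=16, stride=16)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=4, batch_first=True)
        self.global_branch = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # Fusion of the two branches, followed by a depth head.
        self.fuse = nn.Conv2d(2 * feat_dim, feat_dim, 1)
        self.head = nn.Conv2d(feat_dim, 1, 3, padding=1)

    def forward(self, pano):                        # pano: (B, 3, H, W)
        local_feat = self.local_branch(pano)        # (B, C, H, W)
        tokens = self.patch_embed(pano)             # (B, C, H/16, W/16)
        b, c, h, w = tokens.shape
        tokens = self.global_branch(tokens.flatten(2).transpose(1, 2))
        global_feat = tokens.transpose(1, 2).reshape(b, c, h, w)
        global_feat = nn.functional.interpolate(
            global_feat, size=local_feat.shape[-2:],
            mode="bilinear", align_corners=False)
        fused = self.fuse(torch.cat([local_feat, global_feat], dim=1))
        return self.head(fused)                     # dense depth: (B, 1, H, W)

model = GlobalToLocalDepth()
depth = model(torch.randn(1, 3, 256, 512))          # equirectangular input
```

In the paper, the global branch operates on cubemap faces to avoid equirectangular distortion and fuses features at multiple scales; this sketch tokenizes the panorama directly and fuses at a single scale only to keep the example self-contained.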

Code Repositories

LeoDarcy/GLPanoDepth (official PyTorch implementation)

Benchmarks

Benchmark                                     Methodology    Metrics
depth-estimation-on-stanford2d3d-panoramic    GLPanoDepth    RMSE: 0.3493
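
The RMSE figure above is the root-mean-square error between predicted and ground-truth depth over valid pixels. A minimal sketch of the computation, assuming PyTorch tensors and a hypothetical validity mask, might look like:

```python
# Root-mean-square error between predicted and ground-truth depth maps.
# Tensor shapes and the validity mask are illustrative assumptions.
import torch

def rmse(pred, gt, mask=None):
    """RMSE over pixels where the mask is True (all pixels if mask is None)."""
    if mask is None:
        mask = torch.ones_like(gt, dtype=torch.bool)
    diff = (pred - gt)[mask]
    return torch.sqrt(torch.mean(diff ** 2))

pred = torch.rand(1, 512, 1024) * 10.0   # hypothetical predicted depth (meters)
gt = torch.rand(1, 512, 1024) * 10.0     # hypothetical ground-truth depth
print(rmse(pred, gt, mask=gt > 0).item())
```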
