3 months ago

SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning

Yi Feng, Zizhan Guo, Qijun Chen, Rui Fan

Abstract

Unsupervised monocular depth estimation frameworks have shown promising performance in autonomous driving. However, existing solutions primarily rely on a simple convolutional neural network for ego-motion recovery, which struggles to estimate precise camera poses in dynamic, complicated real-world scenarios. These inaccurately estimated camera poses inevitably degrade the photometric reconstruction and mislead the depth estimation networks with wrong supervisory signals. In this article, we introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning. Specifically, a confidence-aware feature flow estimator is proposed to acquire 2D feature positional translations and their associated confidence levels. Meanwhile, we introduce a positional clue aggregator, which integrates pseudo 3D point clouds from DepthNet and 2D feature flows into homogeneous positional representations. Finally, a hierarchical positional embedding injector is proposed to selectively inject spatial clues into semantic features for robust camera pose decoding. Extensive experiments and analyses demonstrate the superior performance of our model compared to other state-of-the-art methods. Remarkably, SCIPaD achieves a reduction of 22.2% in average translation error and 34.8% in average angular error for the camera pose estimation task on the KITTI Odometry dataset. Our source code is available at https://mias.group/SCIPaD.
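
The abstract refers to pseudo 3D point clouds obtained from DepthNet predictions. As a rough illustration only, the sketch below shows the standard pinhole back-projection that lifts a depth map to 3D points in the camera frame. It is not taken from the SCIPaD code: the function name backproject_depth, the toy depth map, and the intrinsics (fx, fy, cx, cy) are all hypothetical, and the paper's positional clue aggregator may differ in how it builds and uses these points.

```python
def backproject_depth(depth, fx, fy, cx, cy):
    """Lift a dense depth map to 3D points in the camera frame.

    depth: list of rows, where depth[v][u] is the depth at pixel (u, v).
    fx, fy, cx, cy: pinhole intrinsics (assumed known, hypothetical here).
    Returns a list of (x, y, z) tuples in row-major pixel order.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            # Invert the pinhole projection u = fx * x / z + cx (same for v).
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with made-up intrinsics, for illustration only.
depth = [[2.0, 2.0],
         [4.0, 4.0]]
cloud = backproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

In a real pipeline the depth map would be a tensor predicted by the depth network and the intrinsics would come from the dataset calibration; the nested loops here would normally be replaced by a vectorized pixel grid.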

Code Repositories

fengyi233/SCIPaD
Official
pytorch

Benchmarks

Benchmark: camera-pose-estimation-on-kitti-odometry
Methodology: SCIPaD
Metrics:
Absolute Trajectory Error [m]: 20.83
Average Rotational Error er [%]: 3.17
Average Translational Error et [%]: 8.63

Benchmark: monocular-depth-estimation-on-kitti-eigen-1
Methodology: SCIPaD (M+640x192)
Metrics:
Delta < 1.25: 0.897
Delta < 1.25^2: 0.964
Delta < 1.25^3: 0.983
Mono: O
RMSE: 4.391
RMSE log: 0.175
Resolution: 640x192
Sq Rel: 0.732
Absolute relative error: 0.098

Benchmark: monocular-depth-estimation-on-kitti-eigen-1
Methodology: SCIPaD
Metrics:
Delta < 1.25: 0.918
Delta < 1.25^2: 0.970
Delta < 1.25^3: 0.985
RMSE: 4.056
RMSE log: 0.166
Resolution: 640x192
Sq Rel: 0.650
Absolute relative error: 0.090
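
For readers unfamiliar with the depth metrics listed above, the following is a minimal sketch of how they are conventionally defined over valid ground-truth pixels. The function name depth_metrics and the toy values are hypothetical and are not drawn from the SCIPaD repository. Published evaluation scripts typically add steps such as depth capping or median scaling, which are omitted here, so this sketch should not be expected to reproduce the reported numbers.

```python
import math


def depth_metrics(gt, pred):
    """Compute common monocular depth metrics.

    gt, pred: equal-length lists of positive depths (ground truth, prediction).
    """
    n = len(gt)
    pairs = list(zip(gt, pred))
    # Absolute and squared relative errors, normalized by ground truth.
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    sq_rel = sum((g - p) ** 2 / g for g, p in pairs) / n
    # Root mean squared error in linear and log depth.
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    rmse_log = math.sqrt(
        sum((math.log(g) - math.log(p)) ** 2 for g, p in pairs) / n
    )
    # Threshold accuracy: fraction of pixels with max(gt/pred, pred/gt) < 1.25^k.
    ratios = [max(g / p, p / g) for g, p in pairs]
    deltas = {k: sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3)}
    return {
        "abs_rel": abs_rel,
        "sq_rel": sq_rel,
        "rmse": rmse,
        "rmse_log": rmse_log,
        "delta_1": deltas[1],
        "delta_2": deltas[2],
        "delta_3": deltas[3],
    }


# Toy values for illustration only.
gt = [10.0, 20.0, 5.0, 8.0]
pred = [10.0, 16.0, 5.0, 16.0]
metrics = depth_metrics(gt, pred)
```

Lower is better for the four error terms, and higher is better for the three Delta accuracies, which is why the second SCIPaD entry above improves on the first in every listed metric.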
