TBP-Former: Learning Temporal Bird's-Eye-View Pyramid for Joint Perception and Prediction in Vision-Centric Autonomous Driving

Fang Shaoheng ; Wang Zi ; Zhong Yiqi ; Ge Junhao ; Chen Siheng ; Wang Yanfeng

Abstract

Vision-centric joint perception and prediction (PnP) has become an emerging trend in autonomous driving research. It predicts the future states of the traffic participants in the surrounding environment from raw RGB images. However, it is still a critical challenge to synchronize features obtained at multiple camera views and timestamps due to inevitable geometric distortions and further exploit those spatial-temporal features. To address this issue, we propose a temporal bird's-eye-view pyramid transformer (TBP-Former) for vision-centric PnP, which includes two novel designs. First, a pose-synchronized BEV encoder is proposed to map raw image inputs with any camera pose at any time to a shared and synchronized BEV space for better spatial-temporal synchronization. Second, a spatial-temporal pyramid transformer is introduced to comprehensively extract multi-scale BEV features and predict future BEV states with the support of spatial-temporal priors. Extensive experiments on nuScenes dataset show that our proposed framework overall outperforms all state-of-the-art vision-based prediction methods.
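
The geometric idea behind a shared, synchronized BEV space can be illustrated with a small sketch. This is not the paper's encoder, which operates on image features; it only shows, under simplifying assumptions, how a BEV point observed in a past ego frame can be re-expressed in a chosen reference ego frame using planar (SE(2)) ego poses. The function names `se2`, `invert_se2`, and `to_reference_frame` are hypothetical, and the axis convention (x forward, y left, yaw counter-clockwise) is an assumption.

```python
import math

def se2(x, y, yaw):
    """Build a 3x3 homogeneous matrix for a planar ego pose (ego -> world)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]

def invert_se2(T):
    """Invert a planar pose: rotation becomes R^T, translation becomes -R^T t."""
    c, s = T[0][0], T[1][0]
    x, y = T[0][2], T[1][2]
    return [[c, s, -(c * x + s * y)],
            [-s, c, -(-s * x + c * y)],
            [0.0, 0.0, 1.0]]

def matmul(A, B):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def to_reference_frame(point_xy, pose_past, pose_ref):
    """Express a point given in the past ego frame in the reference ego frame.

    Both poses are ego -> world matrices, so the combined transform is
    inverse(pose_ref) @ pose_past.
    """
    T = matmul(invert_se2(pose_ref), pose_past)
    px, py = point_xy
    return (T[0][0] * px + T[0][1] * py + T[0][2],
            T[1][0] * px + T[1][1] * py + T[1][2])

# Hypothetical example: the ego vehicle drove 2 m forward between the frames,
# so a point 5 m ahead in the past frame lies 3 m ahead in the reference frame.
past_pose = se2(0.0, 0.0, 0.0)
ref_pose = se2(2.0, 0.0, 0.0)
print(to_reference_frame((5.0, 0.0), past_pose, ref_pose))
```

Applying the same transform to every cell of a past BEV grid places all timestamps in one coordinate system, which is the precondition for stacking them along the time axis.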

Benchmarks

Benchmark | Methodology | Metrics
bird-s-eye-view-semantic-segmentation-on | TBP-Former | IoU ped - 224x480 - Vis filter. - 100x100 at 0.5: 18.6
bird-s-eye-view-semantic-segmentation-on | TBP-Former (static) | IoU ped - 224x480 - Vis filter. - 100x100 at 0.5: 17.2
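
For readers unfamiliar with the metric, the sketch below shows how intersection-over-union (IoU) is commonly computed for a single class on binary BEV occupancy grids. It is a minimal illustration, not the benchmark's evaluation code: the function name `binary_iou` is hypothetical, the grids are tiny toy inputs, and returning 0.0 for an empty union is a convention chosen here. The reported values appear to be percentages, so the sketch scales the result by 100.

```python
def binary_iou(pred, target):
    """Intersection-over-union of two same-shaped binary grids (nested lists)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumption: an empty union (no positives in either grid) scores 0.0.
    return intersection / union if union else 0.0

# Toy 2x3 grids: one overlapping cell, three occupied cells in total.
pred = [[1, 1, 0],
        [0, 0, 0]]
target = [[1, 0, 0],
          [1, 0, 0]]
print(f"IoU: {100 * binary_iou(pred, target):.1f}")
```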