AsymFormer: Asymmetrical Cross-Modal Representation Learning for Mobile Platform Real-Time RGB-D Semantic Segmentation
Siqi Du; Weixi Wang; Renzhong Guo; Ruisheng Wang; Yibin Tian; Shengjun Tang

Abstract
Understanding indoor scenes is crucial for urban studies. Given the dynamic nature of indoor environments, effective semantic segmentation requires both real-time operation and high accuracy. To address this, we propose AsymFormer, a novel network that improves real-time semantic segmentation accuracy using RGB-D multimodal information without substantially increasing network complexity. AsymFormer uses an asymmetrical backbone for multimodal feature extraction, reducing redundant parameters by optimizing the distribution of computational resources. To fuse the asymmetric multimodal features, a Local Attention-Guided Feature Selection (LAFS) module selectively fuses features from the two modalities by leveraging their dependencies. A Cross-Modal Attention-Guided Feature Correlation Embedding (CMA) module is then introduced to further extract cross-modal representations. AsymFormer achieves competitive results, with 54.1% mIoU on NYUv2 and 49.1% mIoU on SUN RGB-D. Notably, it reaches an inference speed of 65 FPS (79 FPS after mixed-precision quantization) on an RTX 3090, demonstrating that AsymFormer can strike a balance between high accuracy and efficiency.
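
The abstract describes a two-stage fusion: a feature-selection gate (LAFS) followed by cross-modal attention (CMA). The sketch below is a minimal, hypothetical PyTorch rendering of that two-stage idea; the class names, tensor shapes, channel-gating scheme, and head count are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of attention-guided RGB-D feature fusion in the spirit
# of AsymFormer's LAFS/CMA modules. All design details here are assumptions.
import torch
import torch.nn as nn


class LAFSGate(nn.Module):
    """Channel-wise gate that selects between RGB and depth features."""

    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(nn.Linear(2 * channels, channels), nn.Sigmoid())

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        # Pool both modalities, predict a per-channel selection weight,
        # then blend the two feature maps with that weight.
        b, c, _, _ = rgb.shape
        stats = torch.cat([self.pool(rgb), self.pool(depth)], dim=1).flatten(1)
        w = self.fc(stats).view(b, c, 1, 1)
        return w * rgb + (1.0 - w) * depth


class CrossModalAttention(nn.Module):
    """Cross-attention where fused/RGB queries attend to depth keys/values."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = x.flatten(2).transpose(1, 2)       # (B, H*W, C) queries
        kv = depth.flatten(2).transpose(1, 2)  # (B, H*W, C) keys/values
        out, _ = self.attn(q, kv, kv)
        out = self.norm(out + q)               # residual + norm
        return out.transpose(1, 2).view(b, c, h, w)


if __name__ == "__main__":
    rgb = torch.randn(1, 64, 30, 40)    # features from the RGB branch
    depth = torch.randn(1, 64, 30, 40)  # features from the depth branch
    fused = LAFSGate(64)(rgb, depth)
    fused = CrossModalAttention(64)(fused, depth)
    print(fused.shape)  # torch.Size([1, 64, 30, 40])
```

One note on the design: gating before cross-attention lets the cheap channel-selection step discard redundant modality responses so the quadratic-cost attention operates on an already-filtered representation, which fits the paper's stated goal of accuracy gains without a large complexity increase.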
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| real-time-semantic-segmentation-on-nyu-depth-1 | AsymFormer | Speed: 65.5 FPS (RTX 3090); 15.3 ms/frame; mIoU: 54.1% |
| semantic-segmentation-on-nyu-depth-v2 | AsymFormer | Mean IoU: 55.3% |
| semantic-segmentation-on-sun-rgbd | AsymFormer | Mean IoU: 49.1% |
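
The speed figures above, and the abstract's jump from 65 to 79 FPS, hinge on mixed-precision inference. Below is a minimal sketch of how such throughput might be measured with PyTorch's FP16 autocast; `measure_fps`, the warm-up and iteration counts, and the input shape are illustrative assumptions, `model` stands in for AsymFormer (not reproduced here), and the paper's exact quantization recipe may differ from plain autocast.

```python
# Hypothetical FPS measurement under FP16 autocast on a CUDA GPU.
import time

import torch


def measure_fps(model: torch.nn.Module, x: torch.Tensor, iters: int = 100) -> float:
    """Return average frames per second for single-input inference."""
    model.eval()
    with torch.no_grad(), torch.autocast("cuda", dtype=torch.float16):
        for _ in range(10):  # warm-up iterations, excluded from timing
            model(x)
        torch.cuda.synchronize()  # drain queued kernels before timing
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        torch.cuda.synchronize()
    return iters / (time.perf_counter() - start)


# Example (hypothetical input shape for a 640x480 RGB frame):
# fps = measure_fps(model.cuda(), torch.randn(1, 3, 480, 640, device="cuda"))
```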