5 months ago

CoLA: Conditional Dropout and Language-driven Robust Dual-modal Salient Object Detection

Shuang Hao; Chunlin Zhong; He Tang

Abstract

Depth and thermal information is beneficial for detecting salient objects alongside conventional RGB images. However, in dual-modal salient object detection (SOD) models, robustness against noisy inputs and missing modalities is crucial but rarely studied. To tackle this problem, we introduce the Conditional Dropout and LAnguage-driven (CoLA) framework, which comprises two core components. 1) Language-driven Quality Assessment (LQA): leveraging a pretrained vision-language model with a prompt learner, LQA recalibrates image contributions without requiring additional quality annotations, which effectively mitigates the impact of noisy inputs. 2) Conditional Dropout (CD): a learning method that strengthens the model's adaptability in scenarios with missing modalities while preserving its performance under complete modalities. CD serves as a plug-in training scheme that treats modality-missing as conditions, strengthening the overall robustness of various dual-modal SOD models. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art dual-modal SOD models under both modality-complete and modality-missing conditions. We will release source code upon acceptance.
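The abstract describes Conditional Dropout only at a high level, so the following is a minimal sketch of the general idea under stated assumptions, not the authors' implementation: during training, one modality is occasionally blanked and a condition label records which case was simulated. The function names, the condition labels, the `p_drop` probability, and the use of zero-filling are all hypothetical choices made for illustration; nested lists stand in for image tensors so the example stays dependency-free.

```python
import random

# Hypothetical condition labels (illustrative names, not taken from the paper).
COMPLETE, RGB_MISSING, AUX_MISSING = 0, 1, 2


def zeros_like(image):
    """Return an all-zero copy of a 2-D nested list."""
    return [[0.0 for _ in row] for row in image]


def conditional_dropout(rgb, aux, p_drop=0.3, rng=random):
    """Randomly blank one modality and report which condition was applied.

    rgb, aux: 2-D nested lists of floats standing in for image tensors
              (aux plays the role of the depth or thermal input).
    p_drop:   assumed probability of simulating a missing modality.
    rng:      any object with a random() method, e.g. random.Random(seed).
    Returns (rgb, aux, condition).
    """
    if rng.random() >= p_drop:
        # Keep both modalities: the modality-complete condition.
        return rgb, aux, COMPLETE
    # Blank exactly one modality, never both, so some input always remains.
    if rng.random() < 0.5:
        return zeros_like(rgb), aux, RGB_MISSING
    return rgb, zeros_like(aux), AUX_MISSING


if __name__ == "__main__":
    rgb = [[0.2, 0.4], [0.6, 0.8]]
    depth = [[1.0, 0.5], [0.25, 0.0]]
    out_rgb, out_depth, cond = conditional_dropout(rgb, depth, rng=random.Random(0))
    print(cond, out_rgb, out_depth)
```

In a sketch like this, the returned condition label is what a downstream model could be conditioned on, which is one way to read "treats modality-missing as conditions"; how the paper actually injects the condition is not stated in the abstract.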

Code Repositories

ssecv/CoLA (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Average MAE | S-Measure | max E-Measure | max F-Measure |
|---|---|---|---|---|---|
| rgb-d-salient-object-detection-on-des | CoLANet | 0.018 | 93.5 | 96.3 | 92.5 |
| rgb-d-salient-object-detection-on-nju2k | CoLANet | 0.029 | 93.4 | 94.7 | 91.3 |
| rgb-d-salient-object-detection-on-nlpr | CoLANet | 0.021 | 93.5 | 95.7 | 90.9 |
| rgb-d-salient-object-detection-on-sip | CoLANet | 0.042 | 89.5 | 93.5 | 89.4 |
| rgb-d-salient-object-detection-on-stere | CoLANet | 0.039 | 90.8 | 94.1 | 88.9 |
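
For readers unfamiliar with the Average MAE figures reported in the benchmarks, the sketch below shows how a mean absolute error between a predicted saliency map and its ground-truth mask is commonly computed for a single image. It assumes both maps share the same shape and are normalized to [0, 1]; the function name and the toy values are illustrative, and the page does not state the exact evaluation protocol used for these numbers.

```python
def mean_absolute_error(pred, gt):
    """Per-pixel mean absolute error between two 2-D maps of equal shape.

    pred: predicted saliency map, values assumed to lie in [0, 1].
    gt:   ground-truth mask, values assumed to lie in [0, 1].
    """
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        if len(pred_row) != len(gt_row):
            raise ValueError("pred and gt rows must have the same length")
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    if count == 0:
        raise ValueError("maps must contain at least one pixel")
    return total / count


if __name__ == "__main__":
    pred = [[0.9, 0.1], [0.8, 0.0]]
    gt = [[1.0, 0.0], [1.0, 0.0]]
    # Absolute errors are 0.1, 0.1, 0.2 and 0.0, so the mean is about 0.1.
    print(mean_absolute_error(pred, gt))
```

A dataset-level score would then typically be the average of this value over all test images; lower is better, whereas higher is better for the S-, E- and F-measures.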
