
ChangeCLIP: Remote sensing change detection with multimodal vision-language representation learning

Xiaoliang Meng, Bo Du, Libo Wang, Sijun Dong

Abstract

Remote sensing change detection (RSCD), which aims to identify surface changes from bitemporal images, is significant for many applications, such as environmental protection and disaster monitoring. In the last decade, driven by the wave of artificial intelligence, many change detection methods based on deep learning have emerged and achieved substantial breakthroughs. However, these methods focus mainly on visual representation learning while ignoring the potential of multimodal data. Recently, the vision-language foundation model CLIP has provided a new paradigm for multimodal AI, demonstrating impressive performance on downstream tasks. Following this trend, we introduce ChangeCLIP, a novel framework tailored to RSCD that leverages robust semantic information from image-text pairs. Specifically, we reconstruct the original CLIP to extract bitemporal features and propose a novel differential features compensation module to capture the detailed semantic changes between them. In addition, we propose a vision-language-driven decoder that combines the results of image-text encoding with the visual features of the decoding stage, thereby enhancing the image semantics. ChangeCLIP achieves state-of-the-art IoU on five well-known change detection datasets: LEVIR-CD (85.20%), LEVIR-CD+ (75.63%), WHUCD (90.15%), CDD (95.87%), and SYSU-CD (71.41%). The code and pretrained models of ChangeCLIP will be publicly available at https://github.com/dyzy41/ChangeCLIP.
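The headline metric in the abstract is IoU. As a rough illustration only, the sketch below computes IoU for the "change" class from a predicted and a reference binary change mask. It is not taken from the ChangeCLIP code base: the function name `change_iou`, the nested-list mask format, and the convention of returning 1.0 when both masks are empty are all assumptions made for this example.

```python
def change_iou(pred, target):
    """IoU of the 'change' class for two binary masks (nested lists of 0/1).

    Hypothetical helper for illustration; not part of ChangeCLIP.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # changed pixel correctly detected
            elif p and not t:
                fp += 1          # false alarm
            elif t and not p:
                fn += 1          # missed change
    union = tp + fp + fn
    # Assumed convention: two empty masks count as perfect agreement.
    return tp / union if union else 1.0


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(change_iou(pred, target))
```

On real imagery the same counts would normally be accumulated over every pixel of the test set before dividing, rather than averaged per image, although the abstract does not state which convention was used.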

Benchmarks

Benchmark | Methodology | Metrics
change-detection-on-cdd-dataset-season-1 | ChangeCLIP | F1: 97.89; F1-Score: 97.89; IoU: 95.87; Overall Accuracy: 99.48; Precision: 98.02; Recall: 97.77
change-detection-on-levir-cd | ChangeCLIP | F1: 92.01; IoU: 85.20; Overall Accuracy: 99.20; Precision: 93.40; Recall: 90.67
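
The reported metrics are related to one another. Assuming they are all computed for the change class from the same confusion counts (the usual convention, though not stated on this page), F1 = 2PR / (P + R) and IoU = F1 / (2 - F1). The sketch below checks the listed values against these identities. The helper names and the tolerance of 0.01 percentage points, chosen to absorb rounding in the published figures, are assumptions for this example.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    return 2 * precision * recall / (precision + recall)


def iou_from_f1(f1):
    """IoU implied by F1 when both come from the same TP/FP/FN counts."""
    return f1 / (2 - f1)


# Values copied from the benchmark listing, converted from percent to fractions.
reported = {
    "CDD": {"precision": 0.9802, "recall": 0.9777, "f1": 0.9789, "iou": 0.9587},
    "LEVIR-CD": {"precision": 0.9340, "recall": 0.9067, "f1": 0.9201, "iou": 0.8520},
}

TOLERANCE = 1e-4  # assumed: 0.01 percentage points, to allow for rounding


def is_consistent(m):
    """True if reported F1 and IoU match the identities within TOLERANCE."""
    f1_ok = abs(f1_from_pr(m["precision"], m["recall"]) - m["f1"]) < TOLERANCE
    iou_ok = abs(iou_from_f1(m["f1"]) - m["iou"]) < TOLERANCE
    return f1_ok and iou_ok


for name, metrics in reported.items():
    print(name, "consistent:", is_consistent(metrics))
```

The identity for IoU follows from F1 = 2TP / (2TP + FP + FN) and IoU = TP / (TP + FP + FN), so a check like this is a quick way to spot transcription errors in a results table.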
