5 months ago

REPAIR: Rank Correlation and Noisy Pair Half-replacing with Memory for Noisy Correspondence

Zheng Ruochen ; Hong Jiahao ; Gao Changxin ; Sang Nong

Abstract

The presence of noise in acquired data invariably leads to performance degradation in cross-modal matching. Unfortunately, obtaining precise annotations in the multimodal field is expensive, which has prompted some methods to tackle the mismatched data pair issue in cross-modal matching contexts, termed as noisy correspondence. However, most of these existing noisy correspondence methods exhibit the following limitations: a) the problem of self-reinforcing error accumulation, and b) improper handling of noisy data pair. To tackle the two problems, we propose a generalized framework termed as Rank corrElation and noisy Pair hAlf-replacing wIth memoRy (REPAIR), which benefits from maintaining a memory bank for features of matched pairs. Specifically, we calculate the distances between the features in the memory bank and those of the target pair for each respective modality, and use the rank correlation of these two sets of distances to estimate the soft correspondence label of the target pair. Estimating soft correspondence based on memory bank features rather than using a similarity network can avoid the accumulation of errors due to incorrect network identifications. For pairs that are completely mismatched, REPAIR searches the memory bank for the most matching feature to replace one feature of one modality, instead of using the original pair directly or merely discarding the mismatched pair. We conduct experiments on three cross-modal datasets, i.e., Flickr30K, MSCOCO, and CC152K, proving the effectiveness and robustness of our REPAIR on synthetic and real-world noise.
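
The two mechanisms described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say which distance or which rank correlation is used, so cosine distance and Spearman correlation are assumptions here, as are clipping the label to [0, 1], the choice to keep the image and replace the text, and all function names (`cosine_distances`, `soft_label`, `half_replace`).

```python
import numpy as np


def cosine_distances(bank, query):
    """Cosine distance from one query vector to every row of a memory bank."""
    bank_n = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    return 1.0 - bank_n @ query_n


def _ranks(values):
    """Rank positions (0..n-1) of a 1-D array; ties are broken by order."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def soft_label(img_feat, txt_feat, bank_img, bank_txt):
    """Estimate a soft correspondence label for one (image, text) pair.

    bank_img[i] and bank_txt[i] are features of the i-th matched pair in the
    memory bank. Distances are computed within each modality, then compared
    by rank. Assumption: Spearman correlation, clipped to [0, 1].
    """
    d_img = cosine_distances(bank_img, img_feat)
    d_txt = cosine_distances(bank_txt, txt_feat)
    # Spearman correlation is the Pearson correlation of the ranks.
    rho = np.corrcoef(_ranks(d_img), _ranks(d_txt))[0, 1]
    return float(np.clip(rho, 0.0, 1.0))


def half_replace(img_feat, bank_img, bank_txt):
    """Replace the text half of a mismatched pair.

    Assumption: the image is kept, the nearest image in the memory bank is
    found, and the text feature stored with it is used as the replacement.
    """
    nearest = int(np.argmin(cosine_distances(bank_img, img_feat)))
    return img_feat, bank_txt[nearest]
```

Under these assumptions, a well-matched pair sees the memory bank "from the same place" in both modalities, so the two distance rankings agree and the label approaches 1. A mismatched pair produces unrelated rankings and a label near 0. Because the label depends only on stored features of matched pairs, it does not rely on a similarity network's own predictions.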

Benchmarks

Benchmark: cross-modal-retrieval-with-noisy-1
Methodology: REPAIR
Metrics:
Image-to-text R@1: 40.5
Image-to-text R@5: 67.7
Image-to-text R@10: 76.1
Text-to-image R@1: 40.3
Text-to-image R@5: 68.2
Text-to-image R@10: 76.4
R-Sum: 369.2

Benchmark: cross-modal-retrieval-with-noisy-2
Methodology: REPAIR
Metrics:
Image-to-text R@1: 79.2
Image-to-text R@5: 95.0
Image-to-text R@10: 96.9
Text-to-image R@1: 59.4
Text-to-image R@5: 84.4
Text-to-image R@10: 89.5
R-Sum: 504.4

Benchmark: cross-modal-retrieval-with-noisy-3
Methodology: REPAIR
Metrics:
Image-to-text R@1: 78.3
Image-to-text R@5: 96.8
Image-to-text R@10: 98.3
Text-to-image R@1: 62.5
Text-to-image R@5: 89.8
Text-to-image R@10: 95.5
R-Sum: 521.2
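
The listing does not define R-Sum. The reported values are consistent with the sum of the six recall scores (R@1, R@5, and R@10 in both retrieval directions), which the short check below reproduces for the first benchmark. The helper name `r_sum` is hypothetical.

```python
def r_sum(image_to_text, text_to_image):
    """Sum the recall scores of both retrieval directions, to one decimal."""
    return round(sum(image_to_text) + sum(text_to_image), 1)


# Recall values (R@1, R@5, R@10) for cross-modal-retrieval-with-noisy-1.
total = r_sum((40.5, 67.7, 76.1), (40.3, 68.2, 76.4))
print(total)  # 369.2, matching the reported R-Sum
```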
