5 months ago

Noisy-Correspondence Learning for Text-to-Image Person Re-identification

Qin Yang ; Chen Yingke ; Peng Dezhong ; Peng Xi ; Zhou Joey Tianyi ; Hu Peng

Abstract

Text-to-image person re-identification (TIReID) is a compelling topic in the cross-modal community, which aims to retrieve the target person based on a textual query. Although numerous TIReID methods have been proposed and have achieved promising performance, they implicitly assume that the training image-text pairs are correctly aligned, which is not always the case in real-world scenarios. In practice, some image-text pairs are inevitably under-correlated or even false-correlated, a.k.a. noisy correspondence (NC), due to low image quality and annotation errors. To address this problem, we propose a novel Robust Dual Embedding method (RDE) that can learn robust visual-semantic associations even with NC. Specifically, RDE consists of two main components: 1) A Confident Consensus Division (CCD) module that leverages the dual-grained decisions of dual embedding modules to obtain a consensus set of clean training data, which enables the model to learn correct and reliable visual-semantic associations. 2) A Triplet Alignment Loss (TAL) that relaxes the conventional Triplet Ranking loss with the hardest negative samples to a log-exponential upper bound over all negative ones, thus preventing model collapse under NC while still focusing on hard-negative samples for promising performance. We conduct extensive experiments on three public benchmarks, namely CUHK-PEDES, ICFG-PEDES, and RSTPReID, to evaluate the performance and robustness of our RDE. Our method achieves state-of-the-art results both with and without synthetic noisy correspondences on all three datasets. Code is available at https://github.com/QinYang79/RDE.
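
The relaxation behind TAL can be illustrated with a small sketch. It is not the code from the official repository: the function names, the margin of 0.1, and the temperature `tau` of 0.02 are assumptions chosen for illustration, and the loss is shown for a single query with plain Python floats. The idea is that the maximum over negative similarities is bounded above by a temperature-scaled log-sum-exp over all negatives, so every negative contributes to the loss while the hardest ones still dominate.

```python
import math

def hardest_negative_triplet(pos_sim, neg_sims, margin=0.1):
    """Hinge triplet ranking loss that uses only the most similar negative."""
    return max(0.0, margin - pos_sim + max(neg_sims))

def log_exp_relaxed_triplet(pos_sim, neg_sims, margin=0.1, tau=0.02):
    """Same hinge, with max(neg) replaced by tau * log(sum(exp(neg / tau))).

    The log-sum-exp term is an upper bound on max(neg) and exceeds it by at
    most tau * log(len(neg_sims)). margin and tau are assumed values.
    """
    peak = max(neg_sims)
    # Subtract the peak before exponentiating for numerical stability.
    lse = peak + tau * math.log(sum(math.exp((s - peak) / tau) for s in neg_sims))
    return max(0.0, margin - pos_sim + lse)

# One query: similarity to its paired sample and to three negatives.
pos, negs = 0.60, [0.50, 0.30, 0.55]
hard = hardest_negative_triplet(pos, negs)
relaxed = log_exp_relaxed_triplet(pos, negs)
print(hard, relaxed)
```

In this sketch the relaxed value is never smaller than the hardest-negative value, and as `tau` shrinks the two converge.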

Code Repositories

QinYang79/RDE

Official

pytorch

Benchmarks

Benchmark	Methodology	Metrics
nlp-based-person-retrival-on-cuhk-pedes	RDE	R@1: 75.94 R@5: 90.63 R@10: 94.12 mAP: 67.56 mINP: 51.44
text-based-person-retrieval-on-icfg-pedes	RDE	R@1: 67.68 R@5: 82.47 R@10: 87.36 mAP: 40.06 mINP: 7.87
text-based-person-retrieval-on-rstpreid-1	RDE	R@1: 65.35 R@5: 83.95 R@10: 89.90 mAP: 50.88 mINP: 28.08
text-based-person-retrieval-with-noisy	RDE	Rank-1: 74.46 Rank-5: 89.42 Rank-10: 93.63 mAP: 66.13 mINP: 49.66
text-based-person-retrieval-with-noisy-1	RDE	Rank-1: 66.54 Rank-5: 81.70 Rank-10: 86.70 mAP: 39.08 mINP: 7.55
text-based-person-retrieval-with-noisy-2	RDE	Rank-1: 64.45 Rank-5: 83.50 Rank-10: 90.00 mAP: 49.78 mINP: 27.43
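
The page does not define the metrics in the table. The sketch below assumes the definitions commonly used in person re-identification: R@K (Rank-K) is 1 if a correct match appears in the top K results, AP averages the precision at each correct match, and INP divides the number of correct matches by the rank of the last one. The benchmark figures would then be these per-query values averaged over all queries and expressed as percentages. The function name and input format are hypothetical.

```python
def query_metrics(ranked_matches, ks=(1, 5, 10)):
    """Per-query retrieval metrics.

    ranked_matches: booleans for the gallery sorted by descending similarity,
    True where the gallery item shares the query's identity.
    Returns (recall_at_k dict, average precision, inverse negative penalty).
    """
    # 1-based rank positions of the correct matches.
    hits = [i + 1 for i, match in enumerate(ranked_matches) if match]
    if not hits:
        raise ValueError("query has no correct match in the gallery")
    recall_at_k = {k: float(hits[0] <= k) for k in ks}
    # Precision at the n-th hit is n / (rank of that hit).
    ap = sum((n + 1) / pos for n, pos in enumerate(hits)) / len(hits)
    # INP penalises a late rank for the hardest (last) correct match.
    inp = len(hits) / hits[-1]
    return recall_at_k, ap, inp

recall, ap, inp = query_metrics([False, True, False, False, True, False])
print(recall, ap, inp)
```

For this ranking, with correct matches at positions 2 and 5, the sketch gives R@1 = 0, R@5 = 1, AP = 0.45, and INP = 0.4.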
