
An Empirical Study of CLIP for Text-based Person Search

Cao Min ; Bai Yang ; Zeng Ziyin ; Ye Mang ; Zhang Min

Abstract

Text-based Person Search (TBPS) aims to retrieve person images using natural language descriptions. Recently, Contrastive Language Image Pretraining (CLIP), a universal large cross-modal vision-language pre-training model, has performed remarkably well on various cross-modal downstream tasks owing to its powerful cross-modal semantic learning capacity. TBPS, as a fine-grained cross-modal retrieval task, is likewise seeing a rise in CLIP-based research. To explore the potential of the vision-language pre-training model for downstream TBPS tasks, this paper makes the first attempt to conduct a comprehensive empirical study of CLIP for TBPS, and thus contributes a straightforward, incremental, yet strong TBPS-CLIP baseline to the TBPS community. We revisit critical design considerations under CLIP, including data augmentation and loss function. The model, with the aforementioned designs and practical training tricks, attains satisfactory performance without any sophisticated modules. We also conduct probing experiments on TBPS-CLIP in model generalization and model compression, demonstrating its effectiveness from various aspects. This work is expected to provide empirical insights and to inform future CLIP-based TBPS research.
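For readers unfamiliar with the loss that CLIP starts from, the following is a minimal sketch of the symmetric image-text contrastive (InfoNCE) objective, written in plain Python for illustration. It is not taken from the TBPS-CLIP repository and does not represent the loss configuration the paper arrives at; the function names are hypothetical, and the default temperature of 0.07 is assumed here because it is the value CLIP uses to initialize its learnable temperature. The sketch also assumes exactly one matching text per image in the batch, which may not hold for person-search data where several images can share an identity.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """CLIP-style symmetric InfoNCE loss over a batch of matched pairs.

    image_embs[i] and text_embs[i] are assumed to describe the same sample;
    every other pairing in the batch is treated as a negative.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)

    # logits[i][j]: cosine similarity of image i and text j, divided by temperature.
    logits = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_on_diagonal(rows):
        # Mean cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for a numerically stable log-sum-exp
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    image_to_text = cross_entropy_on_diagonal(logits)
    text_to_image = cross_entropy_on_diagonal([list(col) for col in zip(*logits)])
    return 0.5 * (image_to_text + text_to_image)
```

With well-aligned embeddings the loss approaches zero, and it grows when the matched pairs are less similar than the mismatched ones.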

Code Repositories

flame-chasers/tbps-clip (official implementation, PyTorch)

Benchmarks

Benchmark	Methodology	R@1	R@5	R@10	mAP
nlp-based-person-retrival-on-cuhk-pedes	TBPS-CLIP (ViT-B/16)	73.54	88.19	92.35	65.38
text-based-person-retrieval-on-icfg-pedes	TBPS-CLIP (ViT-B/16)	65.05	80.34	85.47	39.83
text-based-person-retrieval-on-rstpreid-1	TBPS-CLIP (ViT-B/16)	61.95	83.55	88.75	48.26
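The R@K and mAP columns are standard retrieval metrics. The sketch below shows one common way to compute them for text-to-image person retrieval, assuming a text-by-image similarity matrix and identity labels for both sides. The function name and argument layout are hypothetical, and the exact evaluation protocol behind the numbers above is not stated on this page, so treat this as an illustration of the definitions and not as the benchmark's scoring script.

```python
def rank_k_and_map(similarity, query_ids, gallery_ids, ks=(1, 5, 10)):
    """Compute R@K (in percent) and mAP (in percent) for retrieval.

    similarity[q][g] is the score between text query q and gallery image g.
    A gallery image counts as correct when its identity equals the query's.
    """
    hits = {k: 0 for k in ks}
    ap_sum = 0.0

    for q, row in enumerate(similarity):
        # Gallery indices sorted from most to least similar.
        order = sorted(range(len(row)), key=lambda g: row[g], reverse=True)
        matches = [gallery_ids[g] == query_ids[q] for g in order]

        # R@K: the query is a hit if any correct image is in the top K.
        for k in ks:
            hits[k] += any(matches[:k])

        # Average precision: mean of precision values at each correct rank.
        num_correct, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                num_correct += 1
                precisions.append(num_correct / rank)
        ap_sum += sum(precisions) / len(precisions) if precisions else 0.0

    n = len(similarity)
    recall = {k: 100.0 * hits[k] / n for k in ks}
    return recall, 100.0 * ap_sum / n
```

For example, with two queries and three gallery images:

```python
sim = [[0.9, 0.1, 0.8],
       [0.2, 0.3, 0.7]]
recall, mean_ap = rank_k_and_map(sim, query_ids=[1, 2], gallery_ids=[1, 2, 1], ks=(1, 2))
# recall == {1: 50.0, 2: 100.0}, mean_ap == 75.0
```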
