5 months ago

An Empirical Study of CLIP for Text-based Person Search

Cao Min ; Bai Yang ; Zeng Ziyin ; Ye Mang ; Zhang Min

Abstract

Text-based Person Search (TBPS) aims to retrieve person images using natural language descriptions. Recently, Contrastive Language Image Pretraining (CLIP), a universal large cross-modal vision-language pre-training model, has performed remarkably well on various cross-modal downstream tasks due to its powerful cross-modal semantic learning capacity. TBPS, as a fine-grained cross-modal retrieval task, is also seeing a rise in CLIP-based research. In order to explore the potential of the visual-language pre-training model for downstream TBPS tasks, this paper makes the first attempt to conduct a comprehensive empirical study of CLIP for TBPS and thus contributes a straightforward, incremental, yet strong TBPS-CLIP baseline to the TBPS community. We revisit critical design considerations under CLIP, including data augmentation and loss function. The model, with the aforementioned designs and practical training tricks, can attain satisfactory performance without any sophisticated modules. We also conduct probing experiments of TBPS-CLIP in model generalization and model compression, demonstrating the effectiveness of TBPS-CLIP from various aspects. This work is expected to provide empirical insights and highlight future CLIP-based TBPS research.

Code Repositories

flame-chasers/tbps-clip
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark: nlp-based-person-retrival-on-cuhk-pedes
Methodology: TBPS-CLIP (ViT-B/16)
Metrics: R@1: 73.54, R@5: 88.19, R@10: 92.35, mAP: 65.38

Benchmark: text-based-person-retrieval-on-icfg-pedes
Methodology: TBPS-CLIP (ViT-B/16)
Metrics: R@1: 65.05, R@5: 80.34, R@10: 85.47, mAP: 39.83

Benchmark: text-based-person-retrieval-on-rstpreid-1
Methodology: TBPS-CLIP (ViT-B/16)
Metrics: R@1: 61.95, R@5: 83.55, R@10: 88.75, mAP: 48.26
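
For readers unfamiliar with the R@K and mAP columns, the following is a minimal sketch of how such scores are commonly computed for text-to-image person retrieval. It assumes a precomputed text-to-image similarity matrix and integer identity labels. The function name `retrieval_metrics` and the toy data are hypothetical and written for illustration only; this page does not describe the exact evaluation protocol behind the numbers above, so details may differ.

```python
from typing import Dict, List, Sequence


def retrieval_metrics(
    similarity: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
    ks: Sequence[int] = (1, 5, 10),
) -> Dict[str, float]:
    """Return R@K and mAP in percent (hypothetical helper, not the paper's code).

    similarity[i][j] is the score between text query i and gallery image j.
    """
    hits = {k: 0 for k in ks}
    average_precisions: List[float] = []

    for scores, qid in zip(similarity, query_ids):
        # Gallery indices sorted from most to least similar to this query.
        order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        matches = [gallery_ids[j] == qid for j in order]

        # R@K: the query is a hit if any of its top-K images has the right identity.
        for k in ks:
            hits[k] += any(matches[:k])

        # Average precision: mean of the precision at each rank holding a match.
        found = 0
        precisions: List[float] = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precisions.append(found / rank)
        average_precisions.append(
            sum(precisions) / len(precisions) if precisions else 0.0
        )

    n = len(query_ids)
    result = {f"R@{k}": 100.0 * hits[k] / n for k in ks}
    result["mAP"] = 100.0 * sum(average_precisions) / n
    return result


if __name__ == "__main__":
    # Toy data: 2 text queries, 3 gallery images, 2 identities.
    toy_similarity = [[0.1, 0.9, 0.8], [0.2, 0.7, 0.6]]
    toy_query_ids = [0, 1]
    toy_gallery_ids = [0, 1, 0]
    print(retrieval_metrics(toy_similarity, toy_query_ids, toy_gallery_ids, ks=(1, 2)))
```

In the toy example, the first query ranks a wrong identity first and finds its matches at ranks 2 and 3, so R@1 is 50.0, R@2 is 100.0, and mAP is roughly 79.17.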
