Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models

Yubin Wang, Xinyang Jiang, De Cheng, Dongsheng Li, Cairong Zhao

Abstract

Prompt learning has become a prevalent strategy for adapting vision-language foundation models to downstream tasks. With the emergence of large language models (LLMs), recent studies have explored the use of category-related descriptions as input to enhance prompt effectiveness. Nevertheless, conventional descriptions lack the structured information needed to represent the interconnections among the entities or attributes linked to a particular category. To address this limitation and prioritize harnessing structured knowledge, this paper advocates leveraging LLMs to build a graph for each description that models the entities and attributes describing the category, as well as their correlations. Existing prompt tuning methods are ill-equipped to handle this structured knowledge. Consequently, we propose a novel approach called Hierarchical Prompt Tuning (HPT), which enables simultaneous modeling of both structured and conventional linguistic knowledge. Specifically, we introduce a relationship-guided attention module to capture pair-wise associations among entities and attributes for low-level prompt learning. In addition, by incorporating high-level and global-level prompts that model overall semantics, the proposed hierarchical structure forges cross-level interlinks and empowers the model to handle more complex and long-term relationships. Extensive experiments demonstrate that HPT is highly effective and generalizes much better than existing state-of-the-art methods. Our code is available at https://github.com/Vill-Lab/2024-AAAI-HPT.
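The relationship-guided attention described in the abstract can be sketched as a standard multi-head attention whose scores are biased by the description graph's adjacency matrix, so that connected entity/attribute pairs attend to each other more strongly. The sketch below is a minimal illustration under assumed shapes and parameterization; all names (e.g. `RelationshipGuidedAttention`, `edge_weight`) are hypothetical, not the authors' implementation.

```python
# Minimal sketch of a relationship-guided attention layer: entity/attribute
# token embeddings attend to each other, with a learned bias added wherever
# the description graph contains an edge. Shapes and the scalar edge bias
# are illustrative assumptions, not the paper's exact design.
import torch
import torch.nn as nn


class RelationshipGuidedAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # Learnable scalar controlling how strongly graph edges bias
        # attention (an assumption; other parameterizations are possible).
        self.edge_weight = nn.Parameter(torch.zeros(1))

    def forward(self, tokens: torch.Tensor, adjacency: torch.Tensor) -> torch.Tensor:
        # tokens:    (batch, n, dim) embeddings of entities/attributes
        # adjacency: (batch, n, n) binary matrix, 1 where a relation exists
        b, n, d = tokens.shape
        qkv = self.qkv(tokens).reshape(b, n, 3, self.num_heads, d // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each: (b, heads, n, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale
        # Graph-derived bias: connected token pairs get a learned boost.
        attn = attn + self.edge_weight * adjacency.unsqueeze(1)
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.proj(out)
```

In this sketch, the low-level prompts would be derived from the attended entity/attribute tokens, while separate high-level and global-level prompts (not shown) capture sentence- and task-level semantics, with cross-level connections forming the hierarchy.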

Code Repositories

vill-lab/2024-aaai-hpt (official, PyTorch)
ThomasWangY/2024-AAAI-HPT (PyTorch)

Benchmarks

Benchmark | Methodology | Metric
prompt-engineering-on-caltech-101 | HPT | Harmonic mean: 96.65
prompt-engineering-on-dtd | HPT | Harmonic mean: 72.16
prompt-engineering-on-eurosat | HPT | Harmonic mean: 84.82
prompt-engineering-on-fgvc-aircraft | HPT | Harmonic mean: 40.28
prompt-engineering-on-food-101 | HPT | Harmonic mean: 91.01
prompt-engineering-on-imagenet | HPT | Harmonic mean: 74.17
prompt-engineering-on-imagenet-a | HPT | Top-1 accuracy %: 50.85
prompt-engineering-on-imagenet-r | HPT | Top-1 accuracy %: 77.38
prompt-engineering-on-imagenet-s | HPT | Top-1 accuracy %: 49.36
prompt-engineering-on-imagenet-v2 | HPT | Top-1 accuracy %: 65.25
prompt-engineering-on-oxford-102-flower | HPT | Harmonic mean: 87.16
prompt-engineering-on-oxford-iiit-pet-dataset | HPT | Harmonic mean: 96.71
prompt-engineering-on-stanford-cars-1 | HPT | Harmonic mean: 75.57
prompt-engineering-on-sun397 | HPT | Harmonic mean: 80.88
prompt-engineering-on-ucf101 | HPT | Harmonic mean: 83.16
