Context-Semantic Quality Awareness Network for Fine-Grained Visual Categorization
Xu Qin; Li Sitong; Wang Jiahui; Jiang Bo; Tang Jinhui

Abstract
Exploring and mining subtle yet distinctive features between sub-categories with similar appearances is crucial for fine-grained visual categorization (FGVC). However, less effort has been devoted to assessing the quality of extracted visual representations. Intuitively, the network may struggle to capture discriminative features from low-quality samples, which leads to a significant decline in FGVC performance. To tackle this challenge, we propose a weakly supervised Context-Semantic Quality Awareness Network (CSQA-Net) for FGVC. In this network, to model the spatial contextual relationship between rich part descriptors and global semantics for capturing more discriminative details within the object, we design a novel multi-part and multi-scale cross-attention (MPMSCA) module. Before feeding to the MPMSCA module, the part navigator is developed to address the scale confusion problems and accurately identify the local distinctive regions. Furthermore, we propose a generic multi-level semantic quality evaluation module (MLSQE) to progressively supervise and enhance hierarchical semantics from different levels of the backbone network. Finally, context-aware features from MPMSCA and semantically enhanced features from MLSQE are fed into the corresponding quality probing classifiers to evaluate their quality in real-time, thus boosting the discriminability of feature representations. Comprehensive experiments on four popular and highly competitive FGVC datasets demonstrate the superiority of the proposed CSQA-Net in comparison with the state-of-the-art methods.
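The abstract does not spell out how the MPMSCA module is built, so the following is only a minimal sketch of generic single-head scaled dot-product cross-attention, the basic mechanism that relates one set of features to another. It assumes, purely for illustration, that part descriptors act as queries and global feature tokens act as keys and values. The function name `cross_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and all shapes are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(parts, global_tokens, w_q, w_k, w_v):
    """Generic single-head cross-attention (illustrative, not the paper's MPMSCA).

    parts:         (P, C) part descriptors, used as queries (assumption)
    global_tokens: (N, C) global feature tokens, used as keys/values (assumption)
    """
    q = parts @ w_q                            # (P, d)
    k = global_tokens @ w_k                    # (N, d)
    v = global_tokens @ w_v                    # (N, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (P, N) scaled similarities
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights                # context: (P, d)


# Hypothetical sizes: 4 parts, a 7x7 feature map flattened to 49 tokens.
rng = np.random.default_rng(0)
P, N, C, d = 4, 49, 32, 16
parts = rng.normal(size=(P, C))
global_tokens = rng.normal(size=(N, C))
w_q, w_k, w_v = (rng.normal(size=(C, d)) * C ** -0.5 for _ in range(3))

context, weights = cross_attention(parts, global_tokens, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one part attends to every global token, and `context` holds the resulting context-aware part features. A multi-part, multi-scale variant would presumably repeat this over several part sets and feature resolutions, but that detail is not given in the abstract.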
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| fine-grained-image-classification-on-cub-200 | CSQA-Net | Accuracy: 92.6% |
| fine-grained-image-classification-on-fgvc | CSQA-Net | Accuracy: 94.7% |
| fine-grained-image-classification-on-nabirds | CSQA-Net | Accuracy: 92.3% |
| fine-grained-image-classification-on-stanford | CSQA-Net | Accuracy: 95.6% |