Transformer-based Dual Relation Graph for Multi-label Image Recognition
Zhao Jiawei; Yan Ke; Zhao Yifan; Guo Xiaowei; Huang Feiyue; Li Jia

Abstract
The simultaneous recognition of multiple objects in one image remains a challenging task, spanning multiple events in the recognition field such as various object scales, inconsistent appearances, and confused inter-class relationships. Recent research efforts mainly resort to the statistic label co-occurrences and linguistic word embedding to enhance the unclear semantics. Different from these researches, in this paper, we propose a novel Transformer-based Dual Relation learning framework, constructing complementary relationships by exploring two aspects of correlation, i.e., structural relation graph and semantic relation graph. The structural relation graph aims to capture long-range correlations from object context, by developing a cross-scale transformer-based architecture. The semantic graph dynamically models the semantic meanings of image objects with explicit semantic-aware constraints. In addition, we also incorporate the learnt structural relationship into the semantic graph, constructing a joint relation graph for robust representations. With the collaborative learning of these two effective relation graphs, our approach achieves new state-of-the-art on two popular multi-label recognition benchmarks, i.e., MS-COCO and VOC 2007 dataset.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| multi-label-classification-on-ms-coco | TDRG-R101(448×448) | mAP: 84.6 |
| multi-label-classification-on-ms-coco | TDRG-R101(576×576) | mAP: 86.0 |
| multi-label-classification-on-pascal-voc-2007 | TDRG-R101(448×448) | mAP: 95.0 |
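
The table reports mAP (mean average precision), the usual headline metric for multi-label recognition. As a rough illustration of what that number measures, the sketch below computes per-class average precision from ranked scores and averages it over classes. It is a minimal NumPy sketch, not the authors' evaluation code: the function names `average_precision` and `mean_average_precision` and the toy data are assumptions made for illustration, and the official MS-COCO and VOC 2007 protocols may differ in details such as interpolation or handling of difficult examples.

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one class: mean precision at the rank of each positive image."""
    order = np.argsort(-scores, kind="stable")   # rank images by descending score
    ranked = labels[order]
    n_pos = ranked.sum()
    if n_pos == 0:
        return float("nan")                      # class absent from ground truth
    hits = np.cumsum(ranked)                     # positives seen up to each rank
    precision_at_k = hits / np.arange(1, len(ranked) + 1)
    return float((precision_at_k * ranked).sum() / n_pos)

def mean_average_precision(scores, labels):
    """Mean of per-class AP; scores and labels have shape (images, classes)."""
    aps = [average_precision(scores[:, c], labels[:, c])
           for c in range(scores.shape[1])]
    return float(np.nanmean(aps))                # skip classes with no positives

# Toy data (hypothetical): 4 images, 2 classes, binary ground-truth labels.
scores = np.array([[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]])
labels = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
print(f"mAP: {100 * mean_average_precision(scores, labels):.1f}")
```

Multiplying by 100 matches the percentage scale used in the table above.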