Xiaolin Zhang; Yunchao Wei; Guoliang Kang; Yi Yang; Thomas Huang

Abstract
Weakly supervised methods usually generate localization results based on attention maps produced by classification networks. However, these attention maps highlight only the most discriminative parts of the object, which are small and sparse. We propose to generate Self-produced Guidance (SPG) masks, which separate the foreground (the object of interest) from the background, to provide the classification networks with spatial correlation information between pixels. A stagewise approach is proposed that incorporates highly confident object regions to learn the SPG masks: the highly confident regions within the attention maps are used to learn the masks progressively. The masks are then used as auxiliary pixel-level supervision to facilitate the training of classification networks. Extensive experiments on ILSVRC demonstrate that SPG is effective in producing high-quality object localization maps. In particular, the proposed SPG achieves a Top-1 localization error rate of 43.83% on the ILSVRC validation set, which is a new state-of-the-art error rate.
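
The idea of turning highly confident attention regions into pixel-level supervision can be illustrated with a minimal sketch. It assumes a two-threshold scheme: pixels with high normalized attention are labeled foreground, pixels with low attention are labeled background, and everything in between is marked as ignored. The function name `seed_mask`, the threshold values, and the ignore label 255 are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def seed_mask(attention, high=0.7, low=0.1, ignore=255):
    """Build a coarse foreground/background seed mask from one attention map.

    Hypothetical helper: `high`, `low`, and `ignore` are illustrative
    assumptions, not values taken from the paper.
    Returns a uint8 array with 1 = foreground, 0 = background,
    `ignore` = uncertain pixels that contribute no supervision.
    """
    attention = np.asarray(attention, dtype=np.float64)
    mask = np.full(attention.shape, ignore, dtype=np.uint8)

    span = attention.max() - attention.min()
    if span == 0:
        # A flat map carries no confidence information: ignore every pixel.
        return mask

    # Min-max normalize to [0, 1] so the thresholds are comparable across images.
    norm = (attention - attention.min()) / span
    mask[norm >= high] = 1  # highly confident object pixels
    mask[norm <= low] = 0   # highly confident background pixels
    return mask

attention = np.array([[0.0, 0.2, 0.9],
                      [0.05, 0.5, 1.0]])
mask = seed_mask(attention)
```

In a stagewise setup, a mask like this could supervise a pixel-level branch whose output is thresholded again to label more pixels at the next stage. That loop is omitted here because the abstract does not specify it.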
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| weakly-supervised-object-localization-on | SPG | Top-1 Error Rate: 51.40 |
| weakly-supervised-object-localization-on-1 | SPG | Top-5 Error: 40.00 |
| weakly-supervised-object-localization-on-cub | SPG | MaxBoxAccV2: 60.4; Top-1 Error Rate: 53.36; Top-5 Error: 42.28 |