Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Zhou Mohan ; Bai Yalong ; Zhang Wei ; Zhao Tiejun ; Mei Tao

Abstract

Most object recognition approaches predominantly focus on learning discriminative visual patterns while overlooking the holistic object structure. Though important, structure modeling usually requires significant manual annotations and therefore is labor-intensive. In this paper, we propose to "look into object" (explicitly yet intrinsically model the object structure) through incorporating self-supervisions into the traditional framework. We show the recognition backbone can be substantially enhanced for more robust representation learning, without any cost of extra annotation and inference speed. Specifically, we first propose an object-extent learning module for localizing the object according to the visual patterns shared among the instances in the same category. We then design a spatial context learning module for modeling the internal structures of the object, through predicting the relative positions within the extent. These two modules can be easily plugged into any backbone networks during training and detached at inference time. Extensive experiments show that our look-into-object approach (LIO) achieves large performance gain on a number of benchmarks, including generic object recognition (ImageNet) and fine-grained object recognition tasks (CUB, Cars, Aircraft). We also show that this learning paradigm is highly generalizable to other tasks such as object detection and segmentation (MS COCO). Project page: https://github.com/JDAI-CV/LIO.
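
The abstract says the spatial context learning module predicts relative positions within the object extent, but it does not say how those positions are encoded. The sketch below shows one plausible way to build such self-supervised targets and is not the authors' implementation. The polar (distance, angle) encoding, the choice of the highest-scoring cell as the reference point, the `threshold` parameter, and the function name `relative_position_targets` are all assumptions made for illustration.

```python
import math


def relative_position_targets(extent_mask, threshold=0.5):
    """Build (distance, angle) targets for cells inside a predicted extent.

    extent_mask: square grid (list of lists) of scores in [0, 1], standing in
    for the output of an object-extent module (hypothetical input format).
    Returns a dict mapping (row, col) to (distance, angle), both in [0, 1].
    """
    n = len(extent_mask)
    cells = [(r, c) for r in range(n) for c in range(n)]

    # Assumption: use the cell with the highest extent score as the reference.
    ref_r, ref_c = max(cells, key=lambda rc: extent_mask[rc[0]][rc[1]])

    # Largest possible distance on the grid, used for normalization.
    max_dist = math.hypot(n - 1, n - 1) or 1.0

    targets = {}
    for r, c in cells:
        if extent_mask[r][c] < threshold:
            continue  # cells outside the extent get no target
        dr, dc = r - ref_r, c - ref_c
        dist = math.hypot(dr, dc) / max_dist
        angle = (math.atan2(dr, dc) + math.pi) / (2 * math.pi)
        targets[(r, c)] = (dist, angle)
    return targets


# Toy 3x3 extent: a bright center with four moderately confident neighbors.
mask = [
    [0.1, 0.6, 0.1],
    [0.6, 1.0, 0.6],
    [0.1, 0.6, 0.1],
]
for cell, (dist, angle) in sorted(relative_position_targets(mask).items()):
    print(cell, round(dist, 3), round(angle, 3))
```

Because the targets are derived from the extent itself, no manual annotation is needed. A head trained to regress them could be dropped at inference time, which matches the plug-in and detach behavior the abstract describes.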

Code Repositories

JDAI-CV/LIO (official, PyTorch)
JDAI-CV/DCL (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark                                        Methodology                    Metrics
fine-grained-image-classification-on-cub-200-1   LIO                            Accuracy: 88.0
fine-grained-image-classification-on-fgvc        LIO/ResNet-50 (multi-stage)    Accuracy: 92.7%
fine-grained-image-classification-on-stanford    LIO/ResNet-50 (multi-stage)    Accuracy: 94.5%
