3 months ago

Fine-Grained Visual Classification with Efficient End-to-end Localization

Harald Hanselmann, Hermann Ney

Abstract

Fine-grained visual classification (FGVC) refers to classification tasks in which the classes are very similar, so the classification model must find subtle differences to make the correct prediction. State-of-the-art approaches often include a localization step that helps the classification network by localizing the relevant parts of the input images. However, this usually requires multiple iterations or passes through a full classification network, or complex training schedules. In this work, we present an efficient localization module that can be fused with a classification network in an end-to-end setup. On the one hand, the module is trained by the gradient flowing back from the classification network. On the other hand, two self-supervised loss functions are introduced to increase the localization accuracy. We evaluate the new model on the three benchmark datasets CUB200-2011, Stanford Cars, and FGVC-Aircraft and achieve competitive recognition performance.
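The abstract does not say how the localized region is passed on to the classifier. As an illustration only, the sketch below shows one common way such a hand-off can be made smooth in the box coordinates: cropping a predicted box with bilinear interpolation, so that the sampled values vary continuously with the box. The function names (`bilinear_sample`, `crop_and_resize`), the pixel-coordinate box format, and the use of a grayscale image as nested lists are all assumptions made for this example, not details from the paper. Plain Python computes no gradients; in a deep-learning framework the same sampling would let gradients from the classifier reach the predicted box.

```python
from typing import List, Tuple

Image = List[List[float]]  # hypothetical format: grayscale image as rows of floats


def bilinear_sample(image: Image, x: float, y: float) -> float:
    """Read the image at a fractional (x, y) location with bilinear interpolation."""
    h, w = len(image), len(image[0])
    # Clamp the sampling location to the valid pixel range.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def crop_and_resize(
    image: Image,
    box: Tuple[float, float, float, float],  # assumed (x_min, y_min, x_max, y_max) in pixels
    out_h: int,
    out_w: int,
) -> Image:
    """Sample an out_h x out_w grid spanning the box, corners included."""
    x_min, y_min, x_max, y_max = box
    crop: Image = []
    for i in range(out_h):
        ty = i / (out_h - 1) if out_h > 1 else 0.5
        y = y_min + ty * (y_max - y_min)
        row = []
        for j in range(out_w):
            tx = j / (out_w - 1) if out_w > 1 else 0.5
            x = x_min + tx * (x_max - x_min)
            row.append(bilinear_sample(image, x, y))
        crop.append(row)
    return crop


# Toy 4x4 image whose pixel value is 4 * y + x.
toy = [[float(4 * r + c) for c in range(4)] for r in range(4)]
aligned = crop_and_resize(toy, (1.0, 1.0, 2.0, 2.0), 2, 2)
shifted = crop_and_resize(toy, (0.5, 0.5, 2.5, 2.5), 2, 2)
```

Because the box corners may fall between pixels, a small change in the predicted box produces a small change in the crop, which is what allows a localization module of this general kind to be trained from the classification loss rather than from box annotations.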

Benchmarks

Benchmark | Methodology | Metrics
fine-grained-image-classification-on-cub-200 | AttNet & AffNet | Accuracy: 88.9%
fine-grained-image-classification-on-fgvc | AttNet & AffNet | Accuracy: 94.1%
fine-grained-image-classification-on-stanford | AttNet & AffNet | Accuracy: 95.6%
