MetaFormer: A Unified Meta Framework for Fine-Grained Recognition

Qishuai Diao Yi Jiang Bin Wen Jia Sun Zehuan Yuan

Abstract

Fine-Grained Visual Classification (FGVC) is the task of recognizing objects that belong to multiple subordinate categories of a super-category. Recent state-of-the-art methods usually design sophisticated learning pipelines to tackle this task. However, visual information alone is often insufficient to accurately differentiate between fine-grained visual categories. Meta-information (e.g., spatio-temporal priors, attributes, and text descriptions) now commonly accompanies the images. This inspires us to ask: Is it possible to use a unified and simple framework to exploit various kinds of meta-information to assist fine-grained identification? To answer this question, we explore a unified and strong meta-framework (MetaFormer) for fine-grained visual classification. In practice, MetaFormer provides a simple yet effective approach to the joint learning of vision and various meta-information. Moreover, MetaFormer provides a strong baseline for FGVC without bells and whistles. Extensive experiments demonstrate that MetaFormer can effectively use various meta-information to improve the performance of fine-grained recognition. In a fair comparison, MetaFormer outperforms the current SotA approaches using only vision information on the iNaturalist2017 and iNaturalist2018 datasets. With meta-information added, MetaFormer exceeds the current SotA approaches by 5.9% and 5.3%, respectively. Moreover, MetaFormer achieves 92.3% and 92.7% on CUB-200-2011 and NABirds, significantly outperforming the SotA approaches. The source code and pre-trained models are released at https://github.com/dqshuai/MetaFormer.
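The abstract does not say how meta-information is turned into model input. As a rough illustration only, the sketch below assumes one common way of encoding a spatio-temporal prior: latitude, longitude, and date are scaled to [-1, 1] and passed through sine and cosine so that longitude and date wrap around smoothly. The function name encode_spatiotemporal, the scaling constants, and the idea of appending the result to an image feature vector are all assumptions for illustration, not the authors' implementation.

```python
import math


def encode_spatiotemporal(lat_deg, lon_deg, day_of_year):
    """Map latitude, longitude and date to a bounded 6-dimensional vector.

    Hypothetical helper: this encoding is an assumption for illustration,
    not something specified in the abstract.
    """
    lat = lat_deg / 90.0                     # scale latitude to [-1, 1]
    lon = lon_deg / 180.0                    # scale longitude to [-1, 1]
    date = 2.0 * day_of_year / 365.0 - 1.0   # scale day of year to [-1, 1]

    features = []
    for value in (lat, lon, date):
        # sin/cos keep every feature in [-1, 1]; for longitude and date
        # the two ends of the range map to the same point.
        features.append(math.sin(math.pi * value))
        features.append(math.cos(math.pi * value))
    return features


# Assumed usage: append the meta vector to a (made-up) image feature vector
# before a classifier. The image features here are placeholder numbers.
image_features = [0.12, -0.40, 0.87]
meta_vector = encode_spatiotemporal(lat_deg=37.8, lon_deg=-122.4, day_of_year=150)
joint_input = image_features + meta_vector
print(len(joint_input))
```

Concatenation is the simplest possible fusion and is used here only to show where such a vector could enter a model; the released code at the repository above is the reference for what MetaFormer actually does.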

Code Repositories

salluru007/papers (mentioned in GitHub)
dqshuai/metaformer (official, PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
fine-grained-image-classification-on-cub-200 | MetaFormer (MetaFormer-2,384) | Accuracy: 92.9%
fine-grained-image-classification-on-nabirds | MetaFormer (MetaFormer-2,384) | Accuracy: 93.0%
image-classification-on-inaturalist | MetaFormer (MetaFormer-2,384,extra_info) | Top-1 Accuracy: 83.4%
image-classification-on-inaturalist | MetaFormer (MetaFormer-2,384) | Top-1 Accuracy: 80.4%
image-classification-on-inaturalist-2018 | MetaFormer (MetaFormer-2,384) | Top-1 Accuracy: 84.3%
image-classification-on-inaturalist-2018 | MetaFormer (MetaFormer-2,384,extra_info) | Top-1 Accuracy: 88.7%
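
To make the effect of meta-information in the listing above easier to read, the following short sketch computes, for each iNaturalist benchmark, the absolute gain of the extra_info entry over the vision-only entry and the corresponding relative reduction in error rate. The accuracy values are copied from the benchmark entries; the dictionary layout and the function name meta_gain are illustrative choices, and the comparison is between the two listed MetaFormer entries, not against the other SotA methods mentioned in the abstract.

```python
# Top-1 accuracies (percent) copied from the benchmark listing.
results = {
    "image-classification-on-inaturalist": {
        "vision_only": 80.4,
        "with_meta": 83.4,
    },
    "image-classification-on-inaturalist-2018": {
        "vision_only": 84.3,
        "with_meta": 88.7,
    },
}


def meta_gain(entry):
    """Return (absolute gain in points, relative error reduction in percent)."""
    gain = entry["with_meta"] - entry["vision_only"]
    error_before = 100.0 - entry["vision_only"]
    error_after = 100.0 - entry["with_meta"]
    relative_error_cut = 100.0 * (error_before - error_after) / error_before
    return round(gain, 1), round(relative_error_cut, 1)


gains = {name: meta_gain(entry) for name, entry in results.items()}
for name, (points, error_cut) in gains.items():
    print(f"{name}: +{points} points, {error_cut}% fewer errors")
```

On these numbers the gain from extra_info is 3.0 points on image-classification-on-inaturalist and 4.4 points on image-classification-on-inaturalist-2018.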
