3 months ago

An Attention-Locating Algorithm for Eliminating Background Effects in Fine-grained Visual Classification

Sam Kwong, Zhengguo Li, Mingliang Zhou, Zhenzhe Hechen, Yueting Huang

Abstract

Fine-grained visual classification (FGVC) is a challenging task characterized by interclass similarity and intraclass diversity, and it has broad application prospects. Recently, several methods have adopted the vision Transformer (ViT) for FGVC tasks, since the data specificity of the multihead self-attention (MSA) mechanism in ViT is beneficial for extracting discriminative feature representations. However, these works focus on integrating feature dependencies at a high level, which leaves the model easily disturbed by low-level background information. To address this issue, we propose a fine-grained attention-locating vision Transformer (FAL-ViT) and an attention selection module (ASM). First, FAL-ViT uses a two-stage framework to identify crucial regions within images effectively and to enhance features by strategically reusing parameters. Second, the ASM accurately locates important target regions via the natural scores of the MSA, extracting finer low-level features to offer more comprehensive information through position mapping. Extensive experiments on public datasets demonstrate that FAL-ViT outperforms the other methods, confirming the effectiveness of our proposed methods. The source code is available at https://github.com/Yueting-Huang/FAL-ViT.
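The abstract does not give the details of how the ASM turns attention scores into a region, so the following is only a minimal sketch of the general idea of attention-based locating with position mapping, not the authors' implementation. It assumes that the class-token attention over patches is averaged across heads, that the `top_k` highest-scoring patches are kept, and that their grid positions are mapped back to a pixel bounding box. The function name `locate_region` and all of its parameters are hypothetical.

```python
from typing import List, Tuple


def locate_region(
    cls_attention: List[List[float]],
    grid_size: int,
    patch_size: int,
    top_k: int,
) -> Tuple[int, int, int, int]:
    """Return a pixel box (left, top, right, bottom) around the most-attended patches.

    cls_attention: one list per head, each holding the class-token attention
                   score for every patch in row-major order (assumed layout).
    grid_size:     number of patches per image side (e.g. 14 for 224 / 16).
    patch_size:    side length of one patch in pixels.
    top_k:         hypothetical hyperparameter, number of patches to keep.
    """
    num_patches = grid_size * grid_size
    if not cls_attention:
        raise ValueError("at least one attention head is required")
    if any(len(head) != num_patches for head in cls_attention):
        raise ValueError("each head must provide one score per patch")
    if not 1 <= top_k <= num_patches:
        raise ValueError("top_k must be between 1 and the number of patches")

    # Assumption: heads are combined by a plain average.
    num_heads = len(cls_attention)
    mean_scores = [
        sum(head[i] for head in cls_attention) / num_heads
        for i in range(num_patches)
    ]

    # Keep the indices of the top_k highest-scoring patches.
    selected = sorted(
        range(num_patches), key=lambda i: mean_scores[i], reverse=True
    )[:top_k]

    # Position mapping: flat patch index -> (row, col) on the patch grid.
    rows = [i // grid_size for i in selected]
    cols = [i % grid_size for i in selected]

    # Grid coordinates -> pixel coordinates of the enclosing box.
    left = min(cols) * patch_size
    top = min(rows) * patch_size
    right = (max(cols) + 1) * patch_size
    bottom = (max(rows) + 1) * patch_size
    return left, top, right, bottom
```

In a two-stage setup of the kind the abstract describes, a box like this could be used to crop the input image before a second pass through the same network, although the exact cropping and parameter-reuse strategy of FAL-ViT is not specified here.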

Benchmarks

Benchmark | Methodology | Metrics
fine-grained-image-classification-on-cub-200 | FAL-ViT | Accuracy: 91.7%
fine-grained-image-classification-on-nabirds | FAL-ViT | Accuracy: 91.1%
fine-grained-image-classification-on-stanford-1 | FAL-ViT | Accuracy: 91.1%
