5 months ago

Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning

Chen Chi-Sheng ; Chen Guan-Ying ; Zhou Dong ; Jiang Di ; Chen Dai-Shi

Abstract

Food classification is the foundation for developing food vision tasks and plays a key role in the burgeoning field of computational nutrition. Due to the complexity of food requiring fine-grained classification, recent academic research mainly modifies Convolutional Neural Networks (CNNs) and/or Vision Transformers (ViTs) to perform food category classification. However, to learn fine-grained features, the CNN backbone needs additional structural design, whereas ViT, containing the self-attention module, has increased computational complexity. In recent months, a new Sequence State Space (S4) model, through a Selection mechanism and computation with a Scan (S6), colloquially termed Mamba, has demonstrated superior performance and computation efficiency compared to the Transformer architecture. The VMamba model, which incorporates the Mamba mechanism into image tasks (such as classification), currently establishes the state-of-the-art (SOTA) on the ImageNet dataset. In this research, we introduce an academically underestimated food dataset, CNFOOD-241, and pioneer the integration of a residual learning framework within the VMamba model to concurrently harness both global and local state features inherent in the original VMamba architectural design. The research results show that VMamba surpasses current SOTA models in fine-grained and food classification. The proposed Res-VMamba further improves the classification accuracy to 79.54% without pretrained weights. Our findings elucidate that our proposed methodology establishes a new benchmark for SOTA performance in food recognition on the CNFOOD-241 dataset. The code can be obtained on GitHub: https://github.com/ChiShengChen/ResVMamba.
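
To make the two ideas named in the abstract more concrete, the following is a minimal toy sketch, not the authors' implementation: a single-channel selective state space recurrence whose step size and projections depend on the input, followed by a generic residual connection that adds the input back to the scan output. The function names and the scalar weights `w_delta`, `w_b`, and `w_c` are hypothetical stand-ins for learned projections, and the abstract does not say where Res-VMamba places its residual path, so that placement is an assumption.

```python
import math


def softplus(z):
    """Smooth positive function, used here to keep the step size above zero."""
    return math.log1p(math.exp(z))


def selective_scan(xs, a, w_delta, w_b, w_c):
    """Toy single-channel selective scan (illustrative only).

    xs      : list of floats, the input sequence
    a       : list of N negative floats, a diagonal state matrix
    w_delta : float, hypothetical weight producing the step size
    w_b, w_c: lists of N floats, hypothetical weights producing B_t and C_t
    """
    n = len(a)
    h = [0.0] * n                              # hidden state, starts at zero
    ys = []
    for x in xs:
        delta = softplus(w_delta * x)          # input-dependent step size
        b = [w * x for w in w_b]               # input-dependent B_t
        c = [w * x for w in w_c]               # input-dependent C_t
        for i in range(n):
            a_bar = math.exp(delta * a[i])     # discretised state transition
            b_bar = delta * b[i]               # simplified discretised input matrix
            h[i] = a_bar * h[i] + b_bar * x    # h_t = A_bar * h_{t-1} + B_bar * x_t
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))  # y_t = C_t . h_t
    return ys


def residual_scan(xs, a, w_delta, w_b, w_c):
    """Generic residual connection: output = input + scan(input)."""
    ys = selective_scan(xs, a, w_delta, w_b, w_c)
    return [x + y for x, y in zip(xs, ys)]


if __name__ == "__main__":
    sequence = [1.0, 2.0, 0.5]
    print(residual_scan(sequence, a=[-1.0, -0.5], w_delta=1.0,
                        w_b=[1.0, 0.5], w_c=[1.0, 1.0]))
```

Because the step size, B_t, and C_t are recomputed from each input, the recurrence can decide per element how much state to keep, which is the "selection" idea; the sequential loop here stands in for the scan and ignores the parallel implementation used in practice.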

Code Repositories

chishengchen/resvmamba
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
fine-grained-image-recognition-on-cnfood-241 | Res-VMamba-S | Top-1 accuracy: 79.54
fine-grained-image-recognition-on-cnfood-241 | VMamba-S | Top-1 accuracy: 79.17
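
For readers unfamiliar with the metric, top-1 accuracy is the percentage of samples whose highest-scoring predicted class matches the true label. The sketch below shows one way to compute it; the function name `top1_accuracy` and the toy scores are illustrative assumptions, not taken from the paper or its repository.

```python
def top1_accuracy(scores, labels):
    """Return top-1 accuracy in percent.

    scores: list of per-class score lists, one list per sample
    labels: list of integer class indices, one per sample
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for sample_scores, label in zip(scores, labels):
        # Index of the highest score is the predicted class.
        predicted = max(range(len(sample_scores)), key=sample_scores.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


if __name__ == "__main__":
    toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
    toy_labels = [1, 0, 0, 0]
    print(top1_accuracy(toy_scores, toy_labels))  # 3 of 4 correct -> 75.0
```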
