Image Classification on iNaturalist 2018

Metrics

Top-1 Accuracy
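
Top-1 accuracy is the percentage of evaluation images whose highest-scoring predicted class matches the ground-truth label. A minimal sketch of the computation follows. It assumes per-image class scores and integer labels are already available as plain Python sequences; the function name `top1_accuracy` and the toy inputs are hypothetical, and this is not the benchmark's official evaluation script.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return top-1 accuracy as a percentage.

    scores[i][c] is the model's score for class c on image i (hypothetical input layout).
    labels[i] is the ground-truth class index for image i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one example is required")

    correct = 0
    for row, label in zip(scores, labels):
        # The predicted class is the index of the largest score (argmax).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example with 4 images and 2 classes (made-up numbers, for illustration only).
toy_scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
toy_labels = [1, 0, 0, 0]
print(top1_accuracy(toy_scores, toy_labels))
```

In the toy example the third image is misclassified, so three of the four predictions are correct.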

Results

Performance results of various models on this benchmark

Model	Top-1 Accuracy	Paper Title	Repository
OmniVec2	94.6%	OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning	-
OmniVec	93.8%	OmniVec: Learning robust representations with cross modal sharing	-
InternImage-H	92.6%	InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
MAWS (ViT-2B)	91.3%	The effectiveness of MAE pre-pretraining for billion-scale pretraining
MetaFormer (MetaFormer-2,384,extra_info)	88.7%	MetaFormer: A Unified Meta Framework for Fine-Grained Recognition
Hiera-H (448px)	87.3%	Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
MAE (ViT-H, 448)	86.8%	Masked Autoencoders Are Scalable Vision Learners
SWAG (ViT H/14)	86.0%	Revisiting Weakly Supervised Pre-Training of Visual Perception Models
SEER (RegNet10B - finetuned - 384px)	84.7%	Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
MetaFormer (MetaFormer-2,384)	84.3%	MetaFormer: A Unified Meta Framework for Fine-Grained Recognition
OMNIVORE (Swin-L)	84.1%	Omnivore: A Single Model for Many Visual Modalities
RDNet-L (224 res, IN-1K pretrained)	81.8%	DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
RegNet-8GF	81.2%	Grafit: Learning fine-grained image representations with coarse labels	-
VL-LTR (ViT-B-16)	81.0%	VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition
µ2Net+ (ViT-L/16)	80.97%	A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems
RDNet-B (224 res, IN-1K pretrained)	80.5%	DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
MixMIM-L	80.3%	MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
DeiT-B	79.5%	Training data-efficient image transformers & distillation through attention
CeiT-S (384 finetune resolution)	79.4%	Incorporating Convolution Designs into Visual Transformers
RDNet-S (224 res, IN-1K pretrained)	79.1%	DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
