HyperAI
Image Classification on ImageNet ReaL
Metrics: Accuracy, Params
Results

Performance results of various models on this benchmark.

| Model Name                           | Accuracy | Params | Paper Title                                                                                                  |
|--------------------------------------|----------|--------|--------------------------------------------------------------------------------------------------------------|
| BiT-L                                | 90.54%   | 928M   | Big Transfer (BiT): General Visual Representation Learning                                                   |
| MAWS (ViT-6.5B)                      | 91.1%    | -      | The effectiveness of MAE pre-pretraining for billion-scale pretraining                                       |
| ResMLP-36                            | 85.6%    | 45M    | ResMLP: Feedforward networks for image classification with data-efficient training                           |
| Assemble ResNet-50                   | 87.82%   | -      | Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network           |
| ResMLP-B24/8 (22k)                   | -        | -      | ResMLP: Feedforward networks for image classification with data-efficient training                           |
| BiT-M                                | 89.02%   | -      | Big Transfer (BiT): General Visual Representation Learning                                                   |
| Model soups (ViT-G/14)               | 91.20%   | 1843M  | Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time |
| CeiT-T                               | 83.6%    | -      | Incorporating Convolution Designs into Visual Transformers                                                   |
| TokenLearner L/8 (24+11)             | 91.05%   | 460M   | TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?                                            |
| Meta Pseudo Labels (EfficientNet-L2) | 91.02%   | -      | Meta Pseudo Labels                                                                                           |
| ViTAE-H (MAE, 512)                   | 91.2%    | 644M   | ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond            |
| Model soups (BASIC-L)                | 91.03%   | 2440M  | Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time |
| FixResNeXt-101 32x48d                | 89.73%   | 829M   | Fixing the train-test resolution discrepancy                                                                 |
| LeViT-384                            | 87.5%    | -      | LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference                                       |
| ViT-L @384 (DeiT III, 21k)           | -        | -      | DeiT III: Revenge of the ViT                                                                                 |
| VOLO-D5                              | 90.6%    | -      | VOLO: Vision Outlooker for Visual Recognition                                                                |
| ResMLP-12                            | 84.6%    | 15M    | ResMLP: Feedforward networks for image classification with data-efficient training                           |
| NASNet-A Large                       | 87.56%   | -      | Learning Transferable Architectures for Scalable Image Recognition                                           |
| Assemble-ResNet152                   | 88.65%   | -      | Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network           |
| DeiT-Ti                              | 82.1%    | 5M     | Training data-efficient image transformers & distillation through attention                                  |