# Image Classification on iNaturalist 2018
## Evaluation Metric

Top-1 Accuracy
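Top-1 accuracy is the fraction of test images for which the model's single highest-scoring class prediction matches the ground-truth label. As a minimal sketch of how the metric is computed (the `logits` and `labels` arrays below are illustrative placeholders, not data from this leaderboard):

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose top-scoring prediction equals the true label.

    logits: (N, C) array of per-class scores for N samples and C classes.
    labels: (N,) array of integer ground-truth class indices.
    """
    preds = logits.argmax(axis=1)           # index of the highest-scoring class
    return float((preds == labels).mean())  # proportion of exact matches

# Illustrative toy inputs: 3 samples, 4 classes (not leaderboard data).
logits = np.array([[0.1, 2.0, 0.3, 0.0],
                   [1.5, 0.2, 0.1, 0.0],
                   [0.0, 0.1, 0.2, 3.0]])
labels = np.array([1, 0, 2])

# Leaderboard entries report this fraction as a percentage (x 100).
print(f"{top1_accuracy(logits, labels) * 100:.1f}%")  # -> 66.7%
```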
## Results

Performance of each model on this benchmark.
| Model Name | Top-1 Accuracy | Paper Title | Repository |
|---|---|---|---|
| OmniVec2 | 94.6% | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - |
| OmniVec | 93.8% | OmniVec: Learning robust representations with cross modal sharing | - |
| InternImage-H | 92.6% | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | |
| MAWS (ViT-2B) | 91.3% | The effectiveness of MAE pre-pretraining for billion-scale pretraining | |
| MetaFormer (MetaFormer-2,384,extra_info) | 88.7% | MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | |
| Hiera-H (448px) | 87.3% | Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles | |
| MAE (ViT-H, 448) | 86.8% | Masked Autoencoders Are Scalable Vision Learners | |
| SWAG (ViT H/14) | 86.0% | Revisiting Weakly Supervised Pre-Training of Visual Perception Models | |
| SEER (RegNet10B - finetuned - 384px) | 84.7% | Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision | |
| MetaFormer (MetaFormer-2,384) | 84.3% | MetaFormer: A Unified Meta Framework for Fine-Grained Recognition | |
| OMNIVORE (Swin-L) | 84.1% | Omnivore: A Single Model for Many Visual Modalities | |
| RDNet-L (224 res, IN-1K pretrained) | 81.8% | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |
| RegNet-8GF | 81.2% | Grafit: Learning fine-grained image representations with coarse labels | - |
| VL-LTR (ViT-B-16) | 81.0% | VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition | |
| µ2Net+ (ViT-L/16) | 80.97% | A Continual Development Methodology for Large-scale Multitask Dynamic ML Systems | |
| RDNet-B (224 res, IN-1K pretrained) | 80.5% | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |
| MixMIM-L | 80.3% | MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers | |
| DeiT-B | 79.5% | Training data-efficient image transformers & distillation through attention | |
| CeiT-S (384 finetune resolution) | 79.4% | Incorporating Convolution Designs into Visual Transformers | |
| RDNet-S (224 res, IN-1K pretrained) | 79.1% | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | |
The table above lists the first 20 of the 60 entries on this leaderboard.