Self-Supervised Image Classification
Self Supervised Image Classification On 1
Evaluation Metrics
Number of Params
Top 1 Accuracy

Evaluation Results
Performance of each model on this benchmark.
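For reference, the sketch below shows how Top 1 Accuracy is conventionally computed for an ImageNet-style classification benchmark: the fraction of validation images whose highest-scoring predicted class matches the ground-truth label, reported as a percentage. It also shows the usual way "Number of Params" is counted. This is a minimal illustration, not the leaderboard's own evaluation code; `model` and `val_loader` are hypothetical placeholders.

```python
import torch

@torch.no_grad()
def top1_accuracy(model, val_loader, device="cuda"):
    """Fraction of samples whose argmax prediction equals the label, as a percentage."""
    model.eval().to(device)
    correct, total = 0, 0
    for images, labels in val_loader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)            # [batch, num_classes]
        preds = logits.argmax(dim=1)      # highest-scoring class per image
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return 100.0 * correct / total        # e.g. 88.9 for the top leaderboard entry

def num_params(model):
    """Total parameter count, the 'Number of Params' column (often reported in millions)."""
    return sum(p.numel() for p in model.parameters())
```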
| Model Name | Number of Params | Top 1 Accuracy | Paper Title |
| --- | --- | --- | --- |
| DINOv2 (ViT-g/14, 448) | 1100M | 88.9% | DINOv2: Learning Robust Visual Features without Supervision |
| PercMAE (ViT-L, dVAE) | 307M | 88.6% | Improving Visual Representation Learning through Perceptual Understanding |
| DINOv2 (ViT-g/14) | 1100M | 88.5% | DINOv2: Learning Robust Visual Features without Supervision |
| PeCo (ViT-H/14, 448) | 632M | 88.3% | PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers |
| PercMAE (ViT-L) | 307M | 88.1% | Improving Visual Representation Learning through Perceptual Understanding |
| dBOT (ViT-H/14) | 632M | 88.0% | Exploring Target Representations for Masked Autoencoders |
| MAE (ViT-H/14, 448) | 632M | 87.8% | Masked Autoencoders Are Scalable Vision Learners |
| iBOT (ViT-L/16, 512) | 307M | 87.8% | iBOT: Image BERT Pre-Training with Online Tokenizer |
| MAE + AugSub finetune (ViT-H/14) | 632M | 87.2% | Masking meets Supervision: A Strong Learning Alliance |
| SimMIM (SwinV2-H, 512) | 658M | 87.1% | SimMIM: A Simple Framework for Masked Image Modeling |
| MAE (ViT-H/14) | - | 86.9% | Masked Autoencoders Are Scalable Vision Learners |
| iBOT (ViT-L/16) | 307M | 86.6% | iBOT: Image BERT Pre-Training with Online Tokenizer |
| TEC_MAE (ViT-L/16, 224) | - | 86.5% | Towards Sustainable Self-supervised Learning |
| BEiT-L (ViT) | 307M | 86.3% | BEiT: BERT Pre-Training of Image Transformers |
| CAE (ViT-L/16) | 307M | 86.3% | Context Autoencoder for Self-Supervised Representation Learning |
| MIRL (ViT-B-48) | 341M | 86.2% | Masked Image Residual Learning for Scaling Deeper Vision Transformers |
| MAE + AugSub finetune (ViT-L/16) | 304M | 86.1% | Masking meets Supervision: A Strong Learning Alliance |
| SparK (ConvNeXt-Large, 384) | 198M | 86.0% | Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling |
| BootMAE (ViT-L) | 307M | 85.9% | Bootstrapped Masked Autoencoders for Vision BERT Pretraining |
| SEER (RegNet10B) | 10000M | 85.8% | Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision |
Showing 20 of 65 entries.