MAE (ViT-B/16, 224x224, SSL+FT) | 60.2 | 61.0 | Masked Autoencoders Are Scalable Vision Learners | |
SERE (ViT-B/16, 100ep, 224x224, SSL) | 48.2 | 48.6 | SERE: Exploring Feature Self-relation for Self-supervised Transformer | |
RF-ConvNext-Tiny (rfmerge, P4, 224x224, SUP) | 51.1 | 51.3 | RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks | |
MAE (ViT-B/16, 224x224, SSL) | 37.0 | 38.3 | Masked Autoencoders Are Scalable Vision Learners | |
ConvNext-Tiny (P4, 224x224, SUP) | 48.8 | 48.7 | A ConvNet for the 2020s | |
TEC (ViT-B/16, 224x224, SSL+FT) | - | 62.0 | Towards Sustainable Self-supervised Learning | |
MAE (ViT-B/16, 224x224, SSL, mmseg) | 40.3 | 40.0 | Masked Autoencoders Are Scalable Vision Learners | |
SERE (ViT-S/16, 100ep, 224x224, SSL) | 40.2 | 41.0 | SERE: Exploring Feature Self-relation for Self-supervised Transformer | |
SERE (ViT-S/16, 100ep, 224x224, SSL+FT, mmseg) | 59.0 | 59.4 | SERE: Exploring Feature Self-relation for Self-supervised Transformer | |
RF-ConvNext-Tiny (rfmultiple, P4, 224x224, SUP) | 50.5 | 50.8 | RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks | |
RF-ConvNext-Tiny (rfsingle, P4, 224x224, SUP) | 50.5 | 50.7 | RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks | |
MAE (ViT-B/16, 224x224, SSL+FT, mmseg) | 61.2 | 61.6 | Masked Autoencoders Are Scalable Vision Learners | |
PASS (ResNet-50 D16, 224x224, LUSS) | 20.8 | 21.6 | Large-scale Unsupervised Semantic Segmentation | |
SERE (ViT-S/16, 100ep, 224x224, SSL, mmseg) | 40.5 | 41.0 | SERE: Exploring Feature Self-relation for Self-supervised Transformer | |
SERE (ViT-B/16, 100ep, 224x224, SSL+FT) | 63.3 | 63.0 | SERE: Exploring Feature Self-relation for Self-supervised Transformer | |
TEC (ViT-B/16, 224x224, SSL, mmseg) | 46.0 | 46.1 | Towards Sustainable Self-supervised Learning | |
SERE (ViT-S/16, 100ep, 224x224, SSL+FT) | 57.8 | 58.9 | SERE: Exploring Feature Self-relation for Self-supervised Transformer | |
TEC (ViT-B/16, 224x224, SSL) | - | 42.9 | Towards Sustainable Self-supervised Learning | |
TEC (ViT-B/16, 224x224, SSL+FT, mmseg) | 62.5 | 63.2 | Towards Sustainable Self-supervised Learning | |
PASS (ResNet-50 D32, 224x224, LUSS) | 20.3 | 21.0 | Large-scale Unsupervised Semantic Segmentation | |