HyperAI

Semantic Segmentation on ADE20K val

Metrics

mIoU
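For readers unfamiliar with the metric, mIoU (mean intersection over union) averages the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch, not the benchmark's official evaluation code. The function name `mean_iou`, the flat-list input format, and the `ignore_index=255` convention are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels of equal
    length (for example, all pixels of an evaluation set concatenated).
    Pixels whose target equals ignore_index (an assumed convention) are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the predicted and the true class.
            union[p] += 1
            union[t] += 1
    # Classes that never appear are left out of the average.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


score = mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2)
print(f"mIoU: {100 * score:.1f}")
```

The function returns a fraction in [0, 1]; the leaderboard values correspond to this fraction multiplied by 100.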

Results

Performance results of various models on this benchmark

| Model Name | mIoU (%) | Paper Title | Repository |
|---|---|---|---|
| SeMask (SeMask Swin-L FaPN-Mask2Former) | 58.2 | SeMask: Semantically Masked Transformers for Semantic Segmentation | - |
| Auto-DeepLab-L | 43.98 | Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation | - |
| BEiT-L (ViT+UperNet, ImageNet-22k pretrain) | 57.0 | BEiT: BERT Pre-Training of Image Transformers | - |
| Swin-L (UperNet, ImageNet-22k pretrain) | 53.5 | Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | - |
| EVA | 61.5 | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale | - |
| MixMIM-B | 50.3 | MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers | - |
| SeMask (SeMask Swin-L MSFaPN-Mask2Former, single-scale) | 57.0 | SeMask: Semantically Masked Transformers for Semantic Segmentation | - |
| PatchConvNet-B120 (UperNet) | 52.8 | Augmenting Convolutional networks with attention-based aggregation | - |
| Mask2Former (Swin-L-FaPN, multiscale) | 57.7 | Masked-attention Mask Transformer for Universal Image Segmentation | - |
| Twins-SVT-L (UperNet, ImageNet-1k pretrain) | 50.2 | Twins: Revisiting the Design of Spatial Attention in Vision Transformers | - |
| DNL | 45.97 | Disentangled Non-Local Neural Networks | - |
| DPT-Hybrid | 49.02 | Vision Transformers for Dense Prediction | - |
| DCNAS | 47.12 | DCNAS: Densely Connected Neural Architecture Search for Semantic Image Segmentation | - |
| ACNet (ResNet-101) | 45.90 | Adaptive Context Network for Scene Parsing | - |
| Swin-S (RPE w/ GAB) | 46.41 | Understanding Gaussian Attention Bias of Vision Transformers Using Effective Receptive Fields | - |
| Mask2Former (Swin-L-FaPN) | 56.4 | Masked-attention Mask Transformer for Universal Image Segmentation | - |
| OneFormer (InternImage-H, emb_dim=256, multi-scale, 896x896) | 60.8 | OneFormer: One Transformer to Rule Universal Image Segmentation | - |
| ViT-Adapter-L (UperNet, BEiT pretrain) | 58.4 | Vision Transformer Adapter for Dense Predictions | - |
| Light-Ham (VAN-Large, 46M, IN-1k, MS) | 51.0 | Is Attention Better Than Matrix Decomposition? | - |
| OCR (ResNet-101) | 45.28 | Segmentation Transformer: Object-Contextual Representations for Semantic Segmentation | - |