HyperAI超神经
Semantic Segmentation on ADE20K val
Evaluation metric: mIoU
Results: the performance of each model on this benchmark.
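For reference, mIoU (mean Intersection-over-Union) averages, over all classes, the overlap between the predicted and ground-truth label maps; the scores below are mIoU on the ADE20K validation set. A minimal sketch of the computation, assuming integer label maps (the function name and toy arrays are illustrative, not taken from any benchmark toolkit):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union between two integer label maps.
    Classes absent from both prediction and ground truth are
    excluded from the mean (a common convention)."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        inter = np.logical_and(pred_c, target_c).sum()
        union = np.logical_or(pred_c, target_c).sum()
        if union > 0:  # skip classes that never appear
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 label maps with 2 classes
pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
print(mean_iou(pred, target, num_classes=2))  # ≈ 0.5833 (= (1/2 + 2/3) / 2)
```

Benchmark implementations typically accumulate per-class intersection and union counts over the whole validation set before dividing, rather than averaging per-image scores.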
| Model | mIoU | Paper Title | Repository |
|---|---|---|---|
| BEiT-3 | 62.8 | Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks | |
| ViT-CoMer | 62.1 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | - |
| EVA | 61.5 | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale | |
| FD-SwinV2-G | 61.4 | Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation | |
| OneFormer (InternImage-H, emb_dim=256, multi-scale, 896x896) | 60.8 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| MaskDINO-SwinL | 60.8 | Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | |
| ViT-Adapter-L (Mask2Former, BEiT pretrain) | 60.5 | Vision Transformer Adapter for Dense Predictions | |
| SERNet-Former_v2 | 59.35 | SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks | |
| OneFormer (DiNAT-L, multi-scale, 896x896) | 58.6 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| ViT-Adapter-L (UperNet, BEiT pretrain) | 58.4 | Vision Transformer Adapter for Dense Predictions | |
| RSSeg-ViT-L (BEiT pretrain) | 58.4 | Representation Separation for Semantic Segmentation with Vision Transformers | - |
| OneFormer (DiNAT-L, multi-scale, 640x640) | 58.4 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| OneFormer (Swin-L, multi-scale, 896x896) | 58.3 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| SeMask (SeMask Swin-L FaPN-Mask2Former) | 58.2 | SeMask: Semantically Masked Transformers for Semantic Segmentation | |
| SeMask (SeMask Swin-L MSFaPN-Mask2Former) | 58.2 | SeMask: Semantically Masked Transformers for Semantic Segmentation | |
| DiNAT-L (Mask2Former) | 58.1 | Dilated Neighborhood Attention Transformer | |
| Mask2Former (Swin-L-FaPN, multi-scale) | 57.7 | Masked-attention Mask Transformer for Universal Image Segmentation | |
| OneFormer (Swin-L, multi-scale, 640x640) | 57.7 | OneFormer: One Transformer to Rule Universal Image Segmentation | |
| SeMask (SeMask Swin-L Mask2Former) | 57.5 | SeMask: Semantically Masked Transformers for Semantic Segmentation | |
| SenFormer (BEiT-L) | 57.1 | Efficient Self-Ensemble for Semantic Segmentation | |
Showing the top 20 of 94 entries.