Instance Segmentation On Coco Minival
Evaluation Metrics
APL
APM
APS
mask AP
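Under the standard COCO evaluation convention, APS, APM, and APL are AP restricted to small, medium, and large objects, bucketed by object area in pixels. The following is a minimal sketch of that bucketing, assuming the usual COCO thresholds of 32² and 96² pixels. The function name `size_bucket` is hypothetical, and handling of areas that fall exactly on a threshold is simplified here.

```python
# Area thresholds (in pixels) assumed from the standard COCO evaluation convention.
SMALL_MAX = 32 ** 2    # 1024: below this, an object counts as "small"
MEDIUM_MAX = 96 ** 2   # 9216: below this (and not small), an object counts as "medium"


def size_bucket(area: float) -> str:
    """Return the size-specific metric (hypothetical helper) an object of this area contributes to."""
    if area < SMALL_MAX:
        return "APS"
    if area < MEDIUM_MAX:
        return "APM"
    return "APL"


if __name__ == "__main__":
    for area in (500, 5000, 20000):
        print(area, size_bucket(area))
```

Every object contributes to the overall mask AP regardless of size; the bucket only decides which of the three size-specific metrics it also counts toward.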
Results
Performance of each model on this benchmark
| Model | APL | APM | APS | mask AP | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- | --- |
| Co-DETR | 74.6 | 59.7 | 38.9 | 56.6 | DETRs with Collaborative Hybrid Assignments Training | |
| ViT-CoMer-L (Mask RCNN, DINOv2) | - | - | - | 55.9 | ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | - |
| InternImage-H | 74.4 | 58.4 | 37.9 | 55.4 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | |
| EVA | 72.0 | 58.4 | 37.6 | 55.0 | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale | |
| Mask Frozen-DETR | 72.9 | 58.4 | 37.2 | 54.9 | Mask Frozen-DETR: High Quality Instance Segmentation with One GPU | - |
| Mask DINO (SwinL, multi-scale) | - | - | - | 54.5 | Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | |
| ViT-Adapter-L (HTC++, BEiTv2, O365, multi-scale) | - | - | - | 54.2 | Vision Transformer Adapter for Dense Predictions | |
| GLEE-Pro | - | - | - | 54.2 | General Object Foundation Model for Images and Videos at Scale | |
| SwinV2-G (HTC++) | - | - | - | 53.7 | Swin Transformer V2: Scaling Up Capacity and Resolution | |
| ViTDet, ViT-H Cascade (multiscale) | - | - | - | 53.1 | Exploring Plain Vision Transformer Backbones for Object Detection | |
| GLEE-Plus | - | - | - | 53.0 | General Object Foundation Model for Images and Videos at Scale | |
| Mask DINO (SwinL) | - | - | - | 52.6 | Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | |
| ViT-Adapter-L (HTC++, BEiTv2 pretrain, multi-scale) | - | - | - | 52.5 | Vision Transformer Adapter for Dense Predictions | |
| Soft Teacher + Swin-L (HTC++, multi-scale) | - | - | - | 52.5 | End-to-End Semi-Supervised Object Detection with Soft Teacher | |
| ViT-Adapter-L (HTC++, BEiT pretrain, multi-scale) | - | - | - | 52.2 | Vision Transformer Adapter for Dense Predictions | |
| ViTDet, ViT-H Cascade | - | - | - | 52 | Exploring Plain Vision Transformer Backbones for Object Detection | |
| Soft Teacher + Swin-L (HTC++, single-scale) | - | - | - | 51.9 | End-to-End Semi-Supervised Object Detection with Soft Teacher | |
| CBNetV2 (Dual-Swin-L HTC, multi-scale) | - | - | - | 51.8 | CBNet: A Composite Backbone Network Architecture for Object Detection | |
| Frozen Backbone, SwinV2-G-ext22K (HTC) | - | - | - | 51.6 | Could Giant Pretrained Image Models Extract Universal Representations? | - |
| CBNetV2 (Dual-Swin-L HTC, multi-scale) | - | - | - | 51 | CBNet: A Composite Backbone Network Architecture for Object Detection | |