HyperAI

Instance Segmentation on COCO minival

Metrics

APL
APM
APS
mask AP

Results

Performance results of various models on this benchmark

| Model Name | APL | APM | APS | mask AP | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- | --- |
| Mask R-CNN (FPN, X-volution, SA) | 53.1 | 40 | 19.2 | 37.2 | X-volution: On the unification of convolution and self-attention | - |
| MViTv2-L (Cascade Mask R-CNN, multi-scale, IN21k pre-train) | - | - | - | 50.5 | MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | - |
| XCiT-M24/8 | - | - | - | 43.7 | XCiT: Cross-Covariance Image Transformers | - |
| InternImage-S | - | - | - | 44.5 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | - |
| BoTNet 50 (72 epochs) | - | - | - | 40.7 | Bottleneck Transformers for Visual Recognition | - |
| QueryInst (single scale) | 68.3 | 52.6 | 30.8 | 48.9 | Instances as Queries | - |
| Co-DETR | 74.6 | 59.7 | 38.9 | 56.6 | DETRs with Collaborative Hybrid Assignments Training | - |
| EVA | 72.0 | 58.4 | 37.6 | 55.0 | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale | - |
| CBNetV2 (Dual-Swin-L HTC, multi-scale) | - | - | - | 51.8 | CBNet: A Composite Backbone Network Architecture for Object Detection | - |
| Faster R-CNN (Res2Net-50) | 53.7 | 37.9 | 15.7 | 35.6 | Res2Net: A New Multi-scale Backbone Architecture | - |
| Swin-L (HTC++, multi scale) | - | - | - | 50.4 | Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | - |
| ResNeSt-200 (multi-scale) | - | - | - | 46.25 | ResNeSt: Split-Attention Networks | - |
| ViT-Adapter-L (HTC++, BEiT pretrain, multi-scale) | - | - | - | 52.2 | Vision Transformer Adapter for Dense Predictions | - |
| ELSA-S (Cascade Mask RCNN) | - | - | - | 44.4 | ELSA: Enhanced Local Self-Attention for Vision Transformer | - |
| ViTDet, ViT-H Cascade | - | - | - | 52 | Exploring Plain Vision Transformer Backbones for Object Detection | - |
| Mask R-CNN (ResNet-50-FPN, GRoIE) | 48.7 | 39 | 19.1 | 35.8 | A novel Region of Interest Extraction Layer for Instance Segmentation | - |
| CenterNet2 (Swin-L w/ X-Paste + Copy-Paste) | - | - | - | 48.8 | X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion | - |
| SwinV2-G (HTC++) | - | - | - | 53.7 | Swin Transformer V2: Scaling Up Capacity and Resolution | - |
| MViT-L (Mask R-CNN, single-scale) | - | - | - | 46.2 | MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | - |
| PANet (ResNet-50) | - | - | - | 37.8 | Path Aggregation Network for Instance Segmentation | - |
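
The rows above are not listed in score order, so ranking them by mask AP makes the comparison easier to read. The following is a minimal sketch in Python that does this for the 20 rows shown here (the full benchmark has more entries). The names `ROWS` and `rank_by_mask_ap` are illustrative assumptions, not part of the benchmark page, and the scores are copied from the listing as-is.

```python
# (model name, mask AP) pairs copied from the rows listed above.
# ROWS and rank_by_mask_ap are hypothetical names used only for this sketch.
ROWS = [
    ("Mask R-CNN (FPN, X-volution, SA)", 37.2),
    ("MViTv2-L (Cascade Mask R-CNN, multi-scale, IN21k pre-train)", 50.5),
    ("XCiT-M24/8", 43.7),
    ("InternImage-S", 44.5),
    ("BoTNet 50 (72 epochs)", 40.7),
    ("QueryInst (single scale)", 48.9),
    ("Co-DETR", 56.6),
    ("EVA", 55.0),
    ("CBNetV2 (Dual-Swin-L HTC, multi-scale)", 51.8),
    ("Faster R-CNN (Res2Net-50)", 35.6),
    ("Swin-L (HTC++, multi scale)", 50.4),
    ("ResNeSt-200 (multi-scale)", 46.25),
    ("ViT-Adapter-L (HTC++, BEiT pretrain, multi-scale)", 52.2),
    ("ELSA-S (Cascade Mask RCNN)", 44.4),
    ("ViTDet, ViT-H Cascade", 52.0),
    ("Mask R-CNN (ResNet-50-FPN, GRoIE)", 35.8),
    ("CenterNet2 (Swin-L w/ X-Paste + Copy-Paste)", 48.8),
    ("SwinV2-G (HTC++)", 53.7),
    ("MViT-L (Mask R-CNN, single-scale)", 46.2),
    ("PANet (ResNet-50)", 37.8),
]


def rank_by_mask_ap(rows):
    """Return the rows sorted from highest to lowest mask AP."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_by_mask_ap(ROWS)

# Print the five highest-scoring entries among the listed rows.
for position, (model, mask_ap) in enumerate(ranked[:5], start=1):
    print(f"{position}. {model}: {mask_ap}")
```

Only mask AP is used as the sort key because it is the one metric reported for every listed row; APL, APM, and APS are missing for most entries.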