HyperAI

Semantic Segmentation On Cityscapes Val

Metrics

mIoU
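
As a rough illustration of the metric, mIoU (mean Intersection over Union) averages the per-class ratio TP / (TP + FP + FN). The NumPy sketch below is a minimal version under stated assumptions, not the official Cityscapes evaluation script: the function names `confusion_matrix` and `mean_iou` are hypothetical, labels are assumed to be integer class ids in `[0, num_classes)`, and pixels labelled `255` are assumed to be ignored.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (target, pred) label pairs; rows are ground truth, columns are predictions."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Assumption: pixels carrying ignore_index in the ground truth are excluded.
    mask = target != ignore_index
    idx = target[mask].astype(np.int64) * num_classes + pred[mask].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Average IoU over the classes that occur in the prediction or the ground truth."""
    tp = np.diag(conf).astype(np.float64)
    fp = conf.sum(axis=0) - tp  # predicted as the class, labelled as another
    fn = conf.sum(axis=1) - tp  # labelled as the class, predicted as another
    denom = tp + fp + fn
    valid = denom > 0  # skip classes that never appear
    if not valid.any():
        return float("nan")
    return float((tp[valid] / denom[valid]).mean())


# Tiny two-class example with one ignored pixel.
pred = np.array([[0, 1], [1, 1]])
target = np.array([[0, 1], [0, 255]])
conf = confusion_matrix(pred, target, num_classes=2)
score = mean_iou(conf)
```

For a whole validation set, the confusion matrices of all images would typically be summed before calling `mean_iou`, so that each class IoU reflects the full dataset rather than an average of per-image scores. The values in the table are this score expressed as a percentage.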

Results

Performance results of various models on this benchmark

Model Name | mIoU (%) | Paper Title | Repository
DCT-EDANet | 61.6 | Exploring Semantic Segmentation on the DCT Representation | -
PatchDiverse + Swin-L (multi-scale test, upernet, ImageNet22k pretrain) | 83.6 | Vision Transformers with Patch Diversification |
DetCon_B | 77.0 | Efficient Visual Pretraining with Contrastive Detection |
StreamDEQ (8 iterations) | 78.2 | Representation Recycling for Streaming Video Analysis |
SETR-PUP (80k, MS) | 82.15 | Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers |
Soft Labells (HRnet) | 84.8 | Soft labelling for semantic segmentation: Bringing coherence to label down-sampling |
FasterSeg | 73.1 | FasterSeg: Searching for Faster Real-time Semantic Segmentation |
HRNetV2 (HRNetV2-W40) | 80.2 | Deep High-Resolution Representation Learning for Visual Recognition |
StreamDEQ (2 iterations) | 57.9 | Representation Recycling for Streaming Video Analysis |
Dilated-ResNet (Dilated-ResNet-101) | 75.7 | Deep Residual Learning for Image Recognition |
VPNeXt | 84.4 | VPNeXt -- Rethinking Dense Decoding for Plain Vision Transformer | -
GSCNN (ResNet-50) | 73.0 | Gated-SCNN: Gated Shape CNNs for Semantic Segmentation |
FAN-L-Hybrid | 82.3 | Understanding The Robustness in Vision Transformers |
EfficientViT-B3 (r1184x2368) | 83.2 | EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction |
Trans4Trans | 81.54 | Trans4Trans: Efficient Transformer for Transparent Object and Semantic Scene Segmentation in Real-World Navigation Assistance |
Aerial-PASS (ResNet-18) | 72.8 | Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos | -
DSNet (single-scale) | 80.4 | DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation |
RepVGG-B2 | 80.57 | RepVGG: Making VGG-style ConvNets Great Again |
InternImage-H | 87 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions |
ViT-Adapter-L | 85.8 | Vision Transformer Adapter for Dense Predictions |