HyperAI
Semantic Segmentation On Cityscapes Val
Metrics
mIoU
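The page does not define the metric. As a rough illustration, mIoU (mean intersection-over-union) is usually computed per class as TP / (TP + FP + FN) and then averaged over classes. The sketch below is a minimal pure-Python version under stated assumptions: the function name `mean_iou`, the flat label sequences, and the `ignore_index=255` default are illustrative choices, not anything taken from this leaderboard or from a particular paper's evaluation code. Leaderboards typically report the value multiplied by 100.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union over classes, from flat label sequences.

    Assumption: `pred` and `target` are equal-length sequences of integer
    class ids in [0, num_classes), and ground-truth pixels equal to
    `ignore_index` are excluded (hypothetical default of 255).
    """
    tp = [0] * num_classes            # pixels where prediction == ground truth
    pred_count = [0] * num_classes    # pixels predicted as each class
    target_count = [0] * num_classes  # pixels labelled as each class

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            tp[t] += 1

    ious = []
    for c in range(num_classes):
        # union = TP + FP + FN for class c
        union = pred_count[c] + target_count[c] - tp[c]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(tp[c] / union)

    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with two classes and one ignored pixel.
score = mean_iou(pred=[0, 1, 1, 1, 0], target=[0, 0, 1, 1, 255], num_classes=2)
print(f"mIoU: {100 * score:.2f}")
```

Benchmark evaluations generally accumulate these counts over the whole validation set before dividing, rather than averaging per-image scores, so in practice the counters above would be summed across all images first.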
Results
Performance results of various models on this benchmark
| Model Name | mIoU (%) | Paper Title | Repository |
| --- | --- | --- | --- |
| DCT-EDANet | 61.6 | Exploring Semantic Segmentation on the DCT Representation | - |
| PatchDiverse + Swin-L (multi-scale test, upernet, ImageNet22k pretrain) | 83.6 | Vision Transformers with Patch Diversification | - |
| DetCon_B | 77.0 | Efficient Visual Pretraining with Contrastive Detection | - |
| StreamDEQ (8 iterations) | 78.2 | Representation Recycling for Streaming Video Analysis | - |
| SETR-PUP (80k, MS) | 82.15 | Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers | - |
| Soft Labels (HRNet) | 84.8 | Soft labelling for semantic segmentation: Bringing coherence to label down-sampling | - |
| FasterSeg | 73.1 | FasterSeg: Searching for Faster Real-time Semantic Segmentation | - |
| HRNetV2 (HRNetV2-W40) | 80.2 | Deep High-Resolution Representation Learning for Visual Recognition | - |
| StreamDEQ (2 iterations) | 57.9 | Representation Recycling for Streaming Video Analysis | - |
| Dilated-ResNet (Dilated-ResNet-101) | 75.7 | Deep Residual Learning for Image Recognition | - |
| VPNeXt | 84.4 | VPNeXt -- Rethinking Dense Decoding for Plain Vision Transformer | - |
| GSCNN (ResNet-50) | 73.0 | Gated-SCNN: Gated Shape CNNs for Semantic Segmentation | - |
| FAN-L-Hybrid | 82.3 | Understanding The Robustness in Vision Transformers | - |
| EfficientViT-B3 (r1184x2368) | 83.2 | EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | - |
| Trans4Trans | 81.54 | Trans4Trans: Efficient Transformer for Transparent Object and Semantic Scene Segmentation in Real-World Navigation Assistance | - |
| Aerial-PASS (ResNet-18) | 72.8 | Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos | - |
| DSNet (single-scale) | 80.4 | DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation | - |
| RepVGG-B2 | 80.57 | RepVGG: Making VGG-style ConvNets Great Again | - |
| InternImage-H | 87 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | - |
| ViT-Adapter-L | 85.8 | Vision Transformer Adapter for Dense Predictions | - |