Semantic Segmentation On Cityscapes Val
Metrics
mIoU
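For readers unfamiliar with the metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from a confusion matrix. It is not taken from this benchmark's evaluation code: the function name `mean_iou`, the `ignore_index=255` default, and the choice to average only over classes that occur are illustrative assumptions. The leaderboard reports values in percent, so the returned fraction would be multiplied by 100.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    Hypothetical helper for illustration. Assumes every predicted label
    lies in [0, num_classes) and that `ignore_index` marks unlabeled pixels.
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    # Drop pixels whose ground-truth label is the ignore value.
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # confusion[i, j] counts pixels with ground truth i predicted as j.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(confusion)
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection

    # Average only over classes that actually occur, to avoid 0/0.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


if __name__ == "__main__":
    score = mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2)
    print(f"mIoU: {100 * score:.1f}%")
```

In practice the confusion matrix is usually accumulated over the whole validation set before the per-class IoU is taken, rather than averaged per image; the sketch above handles a single array pair for brevity.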
Results
Performance results of various models on this benchmark
| Model Name | mIoU (%) | Paper Title | Repository |
| --- | --- | --- | --- |
| DCT-EDANet | 61.6 | Exploring Semantic Segmentation on the DCT Representation | - |
| PatchDiverse + Swin-L (multi-scale test, upernet, ImageNet22k pretrain) | 83.6 | Vision Transformers with Patch Diversification | |
| DetCon_B | 77.0 | Efficient Visual Pretraining with Contrastive Detection | |
| StreamDEQ (8 iterations) | 78.2 | Representation Recycling for Streaming Video Analysis | |
| SETR-PUP (80k, MS) | 82.15 | Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers | |
| Soft Labels (HRNet) | 84.8 | Soft labelling for semantic segmentation: Bringing coherence to label down-sampling | |
| FasterSeg | 73.1 | FasterSeg: Searching for Faster Real-time Semantic Segmentation | |
| HRNetV2 (HRNetV2-W40) | 80.2 | Deep High-Resolution Representation Learning for Visual Recognition | |
| StreamDEQ (2 iterations) | 57.9 | Representation Recycling for Streaming Video Analysis | |
| Dilated-ResNet (Dilated-ResNet-101) | 75.7 | Deep Residual Learning for Image Recognition | |
| VPNeXt | 84.4 | VPNeXt -- Rethinking Dense Decoding for Plain Vision Transformer | - |
| GSCNN (ResNet-50) | 73.0 | Gated-SCNN: Gated Shape CNNs for Semantic Segmentation | |
| FAN-L-Hybrid | 82.3 | Understanding The Robustness in Vision Transformers | |
| EfficientViT-B3 (r1184x2368) | 83.2 | EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | |
| Trans4Trans | 81.54 | Trans4Trans: Efficient Transformer for Transparent Object and Semantic Scene Segmentation in Real-World Navigation Assistance | |
| Aerial-PASS (ResNet-18) | 72.8 | Aerial-PASS: Panoramic Annular Scene Segmentation in Drone Videos | - |
| DSNet (single-scale) | 80.4 | DSNet: A Novel Way to Use Atrous Convolutions in Semantic Segmentation | |
| RepVGG-B2 | 80.57 | RepVGG: Making VGG-style ConvNets Great Again | |
| InternImage-H | 87 | InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions | |
| ViT-Adapter-L | 85.8 | Vision Transformer Adapter for Dense Predictions | |