Video Object Detection on ImageNet VID
Evaluation Metric
mAP
Results
Performance of each model on this benchmark
| Model | mAP | Paper Title | Repository |
| --- | --- | --- | --- |
| YOLOV | 87.5 | YOLOV: Making Still Image Object Detectors Great at Video Object Detection | |
| SELSA (ResNet-101) | 82.69 | Sequence Level Semantics Aggregation for Video Object Detection | |
| REPP + SELSA (ResNet-101) | 84.2 | Robust and Efficient Post-Processing for Video Object Detection (REPP) | |
| BoxMask (ResNet-50) | 80.7 | BoxMask: Revisiting Bounding Box Supervision for Video Object Detection | - |
| SELSA (ResNeXt-101) | 84.3 | Sequence Level Semantics Aggregation for Video Object Detection | |
| YOLOV++ | 93.2 | Practical Video Object Detection via Feature Selection and Aggregation | |
| Ours (Faster RCNN + R101) | 87.2 | Objects do not disappear: Video object detection by single-frame object location anticipation | |
| Ours (Def. DETR + SwinB) | 91.3 | Objects do not disappear: Video object detection by single-frame object location anticipation | |
| DiffusionVID (ResNet-101) | 87.1 | DiffusionVID: Denoising Object Boxes with Spatio-temporal Conditioning for Video Object Detection | |
| SparseVOD (ResNet-50) | 80.3 | Spatio-Temporal Learnable Proposals for End-to-End Video Object Detection | - |
| Online TSM | 76.3 | TSM: Temporal Shift Module for Efficient Video Understanding | |
| DiffusionVID (Swin-B) | 92.5 | DiffusionVID: Denoising Object Boxes with Spatio-temporal Conditioning for Video Object Detection | |
| LSTS (ResNet-101) | 81.7 | Learning Where to Focus for Efficient Video Object Detection | |
| REPP + YOLOv3 | 75.1 | Robust and Efficient Post-Processing for Video Object Detection (REPP) | |
| Tracklet-Conditioned Detection+DCNv2+FGFA | 83.5 | Integrated Object Detection and Tracking with Tracklet-Conditioned Detection | - |
| TransVOD (Swin Base) | 90.1 | TransVOD: End-to-End Video Object Detection with Spatial-Temporal Transformers | |
| ClipVID | 85.8 | Identity-Consistent Aggregation for Video Object Detection | |
| MEGA (ResNeXt101) | 85.4 | Memory Enhanced Global-Local Aggregation for Video Object Detection | |
| PTSEFormer (ResNet-101) | 88.1 | PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection | |
| Looking Fast and Slow | 63.9 | Looking Fast and Slow: Memory-Guided Mobile Video Object Detection | |
20 of 33 entries shown.