Self-Supervised Action Recognition on HMDB51
Evaluation Metrics
Frozen
Pre-Training Dataset
Top-1 Accuracy
Evaluation Results
Performance of each model on this benchmark
| Model | Frozen | Pre-Training Dataset | Top-1 Accuracy | Paper Title | Repository |
| --- | --- | --- | --- | --- | --- |
| MVD (ViT-B) | false | Kinetics400 | 79.7 | Masked Video Distillation: Rethinking Masked Feature Modeling for Self-supervised Video Representation Learning | |
| M3Video | false | Kinetics400 | 78.0 | Masked Motion Encoding for Self-Supervised Video Representation Learning | |
| pBYOL | false | Kinetics400 | 75.0 | A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning | |
| SCE (R3D-50) | false | Kinetics400 | 74.7 | Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning | |
| VideoMAE | false | Kinetics400 | 73.3 | VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | |
| BraVe:V-FA (TSM-50x2) | false | - | 70.5 | Broaden Your Views for Self-Supervised Video Learning | |
| CVRL (R3D-152 2x; K600) | false | Kinetics600 | 69.9 | Spatiotemporal Contrastive Video Representation Learning | |
| XKD (ViT-B/112/16) | - | - | 69 | XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning | |
| XDC | false | IG-Kinetics | 68.9 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | |
| CVRL (R3D-50; K600) | false | Kinetics600 | 68.0 | Spatiotemporal Contrastive Video Representation Learning | |
| CrissCross (AudioSet) | false | AudioSet | 66.8 | Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity | |
| CVRL (R3D-50; K400) | false | Kinetics400 | 66.7 | Spatiotemporal Contrastive Video Representation Learning | |
| XDC | false | IG-Random | 66.5 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | |
| XKD-Modality-Agnostic (ViT-B/112/16) | - | - | 65.9 | XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning | |
| VideoMS (ViT-B) | false | no extra data | 65.8 | EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens | |
| RSPNet | false | Kinetics400 | 64.7 | RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning | |
| CrissCross (Kinetics400) | false | Kinetics400 | 64.7 | Self-Supervised Audio-Visual Representation Learning with Relaxed Cross-Modal Synchronicity | |
| AVID+CMA (Modified R2+1D-18 on Audioset) | false | Audioset (Video+Audio) | 64.7 | Audio-Visual Instance Discrimination with Cross-Modal Agreement | |
| ELo | false | - | 64.5 | Evolving Losses for Unsupervised Video Representation Learning | - |
| AVID (Modified R2+1D-18 on Audioset) | false | Audioset (Video+Audio) | 64.1 | Audio-Visual Instance Discrimination with Cross-Modal Agreement | |
Self Supervised Action Recognition On Hmdb51 | SOTA | HyperAI超神经