HyperAI

Action Classification on Kinetics-700

Metrics

Top-1 Accuracy
Top-5 Accuracy
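
For readers unfamiliar with the two metrics, here is a minimal sketch of how top-k accuracy is commonly computed: a prediction counts as correct when the true label is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not taken from the benchmark or from any listed paper, and k=2 stands in for k=5 because the toy data has only three classes.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 1,
) -> float:
    """Return the percentage of samples whose true label is among
    the k highest-scoring classes (hypothetical helper)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Toy data (assumed): 4 clips, 3 classes.
scores = [
    [0.7, 0.2, 0.1],  # true label 0: correct at k=1
    [0.1, 0.3, 0.6],  # true label 1: wrong at k=1, correct at k=2
    [0.5, 0.4, 0.1],  # true label 2: wrong at k=1 and k=2
    [0.2, 0.5, 0.3],  # true label 1: correct at k=1
]
labels = [0, 1, 2, 1]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.1f}%  top-2: {top2:.1f}%")
```

On Kinetics-700 the same computation would run over 700 classes with k=1 and k=5. Individual papers may differ in how they sample clips and crops per video before averaging scores, so this sketch covers only the final counting step.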

Results

Performance results of various models on this benchmark

| Model Name | Top-1 Accuracy (%) | Top-5 Accuracy (%) | Paper Title | Repository |
|---|---|---|---|---|
| SRTG r(2+1)d-34 | 49.43 | 73.23 | Learn to cycle: Time-consistent feature discovery for action recognition | - |
| MViTv2-B | 76.6 | 93.2 | MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | - |
| SRTG r3d-50 | 53.52 | 74.17 | Learn to cycle: Time-consistent feature discovery for action recognition | - |
| MoViNet-A1 | 63.5 | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | - |
| MoViNet-A2 | 66.7 | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | - |
| MoViNet-A3 | 68.0 | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | - |
| VidTr-M | 69.5 | 88.3 | VidTr: Video Transformer Without Convolutions | - |
| InternVideo-T | 84.0 | - | InternVideo: General Video Foundation Models via Generative and Discriminative Learning | - |
| EVA | 82.9 | - | EVA: Exploring the Limits of Masked Visual Representation Learning at Scale | - |
| MoViNet-A4 | 70.7 | - | MoViNets: Mobile Video Networks for Efficient Video Recognition | - |
| UniFormerV2-L | 82.7 | 96.2 | UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer | |
| MaskFeat (no extra data, MViT-L) | 80.4 | 95.7 | Masked Feature Prediction for Self-Supervised Visual Pre-Training | - |
| InternVideo2-1B | 85.4 | - | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | - |
| SRTG r3d-101 | 56.46 | 76.82 | Learn to cycle: Time-consistent feature discovery for action recognition | - |
| VidTr-L | 70.2 | 89 | VidTr: Video Transformer Without Convolutions | - |
| SRTG r3d-34 | 49.15 | 72.68 | Learn to cycle: Time-consistent feature discovery for action recognition | - |
| mPLUG-2 | 80.4 | 94.9 | mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video | - |
| UMT-L (ViT-L/16) | 83.6 | 96.7 | Unmasked Teacher: Towards Training-Efficient Video Foundation Models | - |
| CoVeR (JFT-3B) | 79.8 | 94.9 | Co-training Transformer with Videos and Images Improves Action Recognition | - |
| MViTv2-L (ImageNet-21k pretrain) | 79.4 | 94.9 | MViTv2: Improved Multiscale Vision Transformers for Classification and Detection | - |