HyperAI

Audio Classification on ESC-50

Metrics

Top-1 Accuracy
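For reference, top-1 accuracy is the share of test clips whose single highest-scoring predicted class equals the true class, reported here as a percentage. The sketch below is only an illustration of that definition: the function name `top1_accuracy`, the list-of-scores input layout, and the toy numbers are all assumptions, not taken from any model or paper listed on this page.

```python
def top1_accuracy(scores, labels):
    """Return the percentage of examples whose top-scoring class equals the label.

    scores: one list of per-class scores per example (hypothetical layout).
    labels: one integer class index per example.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the highest score is the top-1 prediction.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data: 4 clips, 3 classes (made-up scores, not benchmark results).
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [1, 0, 2, 2]

print(top1_accuracy(scores, labels))
```

In this toy case three of the four predictions match their labels, so the function reports 75.0.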

Results

Performance results of various models on this benchmark

| Model Name | Top-1 Accuracy (%) | Paper Title | Repository |
|---|---|---|---|
| AVID | 89.2 | Audio-Visual Instance Discrimination with Cross-Modal Agreement | |
| mn40_as | 97.45 | Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation | |
| M2D/0.7 | 96.0 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | |
| M2D-CLAP/0.7 | 97.4 | M2D-CLAP: Masked Modeling Duo Meets CLAP for Learning General-purpose Audio-Language Representation | |
| BEATs | 98.1 | BEATs: Audio Pre-Training with Acoustic Tokenizers | |
| OmniVec2 | 99.1 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - |
| LHGNN | 96.2 | LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging | - |
| XDC | 84.8 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | |
| ERANN-2-5 | 96.1 | ERANNs: Efficient Residual Audio Neural Networks for Audio Pattern Recognition | - |
| ACDNet | 87.1 | Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices | |
| OmniVec | 98.4 | OmniVec: Learning robust representations with cross modal sharing | - |
| Multi-Channel Audio Feature with CNN | 89.5 | - | - |
| Audio Spectrogram Transformer | 95.7 | AST: Audio Spectrogram Transformer | |
| AVTS | 82.3 | Cooperative Learning of Audio and Video Models from Self-Supervised Synchronization | - |
| SepTr + LeRaC | 91.58 | Learning Rate Curriculum | |
| XDC | 85.4 | Self-Supervised Learning by Cross-Modal Audio-Video Clustering | |
| L3 | 79.3 | Look, Listen and Learn | |
| Multi-Format Contrastive | 90.5 | Multi-Format Contrastive Learning of Audio Representations | - |
| InternVideo2 | 98.6 | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | |
| DyMN-L | 97.4 | Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models | |