HyperAI
Audio Classification On Audioset
Evaluation metric
Test mAP
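The page names the metric but does not define it. As a rough sketch, assuming the common definition for multi-label audio tagging (average precision computed per class, then averaged with equal weight across classes), mAP could be computed as below. The function names and the toy data are hypothetical and are not taken from any listed paper; ties in score are broken by input order here, which may differ from library implementations.

```python
def average_precision(labels, scores):
    """AP for one class: mean precision at the rank of each positive clip."""
    # Rank clips from highest to lowest score (stable sort keeps input order on ties).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    # A class with no positive clips has no defined AP.
    return sum(precisions) / hits if hits else None


def mean_average_precision(y_true, y_score):
    """Macro-average of per-class AP; rows are clips, columns are classes."""
    n_classes = len(y_true[0])
    aps = []
    for c in range(n_classes):
        ap = average_precision([row[c] for row in y_true],
                               [row[c] for row in y_score])
        if ap is not None:  # skip classes without positives (an assumption)
            aps.append(ap)
    return sum(aps) / len(aps)


# Toy example: 4 clips, 2 classes (made-up labels and scores).
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
y_score = [[0.9, 0.2], [0.8, 0.7], [0.3, 0.6], [0.1, 0.4]]
print(round(mean_average_precision(y_true, y_score), 3))
```

On the toy data, class 0 has AP 5/6 and class 1 has AP 1, so the sketch prints 0.917. Scores on the leaderboard are on the same 0 to 1 scale.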
Evaluation results
Performance of each model on this benchmark
| Model | Test mAP | Paper Title | Repository |
| --- | --- | --- | --- |
| OmniVec2 | 0.558 | OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | - |
| OmniVec | 0.548 | OmniVec: Learning robust representations with cross modal sharing | - |
| EquiAV | 0.546 | EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning | |
| MAViL (Audio-Visual, single) | 0.533 | - | - |
| Audiovisual Masked Autoencoder (Audiovisual, Single) | 0.518 | Audiovisual Masked Autoencoders | |
| CAV-MAE (Audio-Visual) | 0.512 | Contrastive Audio-Visual Masked Autoencoder | |
| BEATs (Audio-only, Ensemble) | 0.506 | BEATs: Audio Pre-Training with Acoustic Tokenizers | |
| UAVM (Audio + Video) | 0.504 | UAVM: Towards Unifying Audio and Visual Models | |
| SSLAM (Audio-Only, Single) | 0.502 | SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes | - |
| mn40_as (Ensemble) | 0.498 | Efficient Large-scale Audio Tagging via Transformer-to-CNN Knowledge Distillation | |
| ATST-C2F(Single) | 0.497 | Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks | |
| MBT (AS-500K training + Video) | 0.496 | Attention Bottlenecks for Multimodal Fusion | |
| PaSST (Ensemble) | 0.496 | Efficient Training of Audio Transformers with Patchout | |
| DyMN-L (Audio-Only, Single) | 0.490 | Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models | |
| HTS-AT (Ensemble) | 0.487 | HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | |
| EAT | 0.486 | EAT: Self-Supervised Pre-Training with Efficient Audio Transformer | |
| BEATs (Audio-only, Single) | 0.486 | BEATs: Audio Pre-Training with Acoustic Tokenizers | |
| DTF-AT (Single) | 0.486 | DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification | - |
| M2D-AS/0.7 | 0.485 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | |
| AST (Ensemble) | 0.485 | AST: Audio Spectrogram Transformer | |