VoxCeleb1

Speaker Identification Benchmark

Results: 12 | Metric: Top-1 accuracy (%)
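
For reference, Top-1 accuracy is conventionally the percentage of test utterances whose highest-scoring predicted speaker matches the true speaker. The sketch below illustrates that computation under stated assumptions: the function name `top1_accuracy`, the per-utterance score lists, and the integer speaker labels are all hypothetical and are not taken from any of the listed papers or their evaluation code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return Top-1 accuracy as a percentage.

    scores: one row per utterance, one score per candidate speaker (hypothetical layout).
    labels: index of the true speaker for each utterance.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one utterance is required")

    correct = 0
    for row, label in zip(scores, labels):
        # The predicted speaker is the index with the highest score.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data: 4 utterances, 3 candidate speakers (illustrative values only).
example_scores = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],
]
example_labels = [1, 0, 2, 1]
print(top1_accuracy(example_scores, example_labels))  # 3 of 4 correct: 75.0
```

Individual papers may differ in how they split the data, so scores on this page are best compared with each paper's evaluation protocol in mind.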

Top Performing Models

| Rank | Model | Paper | Top-1 (%) | Date | Code |
|------|-------|-------|-----------|------|------|
| 1 | MSM-MAE | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | 96.60 | 2024-04-09 | nttcslab/m2d, nttcslab/eval-audio-repr |
| 2 | M2D/0.6 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | 96.50 | 2024-04-09 | nttcslab/m2d, nttcslab/eval-audio-repr |
| 3 | M2D/0.7 | Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | 96.30 | 2024-04-09 | nttcslab/m2d, nttcslab/eval-audio-repr |
| 4 | AudioMAE (local) | Masked Autoencoders that Listen | 94.80 | 2022-07-13 | facebookresearch/multimodal, facebookresearch/audiomae, rishikksh20/AudioMAE-pytorch, eml-eda/tle-supervised |
| 5 | M2D ratio=0.6 | Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input | 94.80 | 2022-10-26 | nttcslab/m2d |
| 6 | ATST Base (ours) | ATST: Audio Representation Learning with Teacher-Student Transformer | 94.30 | 2022-04-26 | Audio-WestlakeU/ATST-SED, Audio-WestlakeU/audiossl, 2024-MindSpore-1/Code6, 2023-MindSpore-4/Code8 |
| 7 | AudioMAE (global) | Masked Autoencoders that Listen | 94.10 | 2022-07-13 | facebookresearch/multimodal, facebookresearch/audiomae, rishikksh20/AudioMAE-pytorch, eml-eda/tle-supervised |
| 8 | AutoSpeech (N=8,C=128) | AutoSpeech: Neural Architecture Search for Speaker Recognition | 87.66 | 2020-05-07 | VITA-Group/AutoSpeech, TAMU-VITA/AutoSpeech, JeongwookUm/TEST_AutoSpeech-master |
| 9 | SSAST-FRAME | SSAST: Self-Supervised Audio Spectrogram Transformer | 80.80 | 2021-10-19 | YuanGongND/ast, YuanGongND/ssast, siavashshams/ssamba |
| 10 | SSAMBA | SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | 70.10 | 2024-05-20 | siavashshams/ssamba |
