
VGG-Sound

Multi-modal Classification Benchmark

[Chart: Performance Over Time. Metric: Top-1 Accuracy, 2 results]

Top Performing Models

| Rank | Model | Paper | Top-1 Accuracy (%) | Date | Code |
|------|-------|-------|--------------------|------|------|
| 1 | CAV-MAE (Audio-Visual) | Contrastive Audio-Visual Masked Autoencoder | 65.90 | 2022-10-02 | YuanGongND/cav-mae |
| 2 | UAVM | UAVM: Towards Unifying Audio and Visual Models | 65.80 | 2022-07-29 | YuanGongND/uavm |
