Clotho

Text to Audio Retrieval Benchmark

Performance Over Time

Metric: R@1 (12 results)
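For readers unfamiliar with the metric, R@1 (recall at rank 1) is commonly defined as the share of text queries for which the correct audio clip is ranked first among all candidates. The sketch below illustrates that definition on a toy similarity matrix. It is not the evaluation code of any paper listed here. The function name `recall_at_k`, the toy scores, and the choice to report a percentage are assumptions made for illustration.

```python
def recall_at_k(similarity, target_index, k=1):
    """Percentage of queries whose correct clip appears in the top-k results.

    similarity[i][j] is the score of text query i against audio clip j
    (higher means more similar); target_index[i] is the index of the
    correct clip for query i. Both inputs are hypothetical toy structures.
    """
    hits = 0
    for scores, target in zip(similarity, target_index):
        # Rank clip indices by descending score. sorted() is stable,
        # so tied scores keep the lower clip index first.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy example: 4 text queries scored against 3 audio clips.
similarity = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.1],
    [0.3, 0.2, 0.7],
]
targets = [0, 1, 2, 2]

r_at_1 = recall_at_k(similarity, targets, k=1)
r_at_2 = recall_at_k(similarity, targets, k=2)
print(f"R@1 = {r_at_1:.2f}, R@2 = {r_at_2:.2f}")
```

In this toy case, two of the four queries rank their correct clip first, so R@1 is 50.00. Published evaluations may differ in details such as how multiple captions per clip are handled.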

Top Performing Models

Rank | Model | Paper | R@1 | Date | Code
1 | PaSST-RoBERTa & Estimated Audio–Caption Correspondences | Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval | 27.69 | 2024-08-21 | optimusprimus/salsa
2 | InternVideo2-6B | InternVideo2: Scaling Foundation Models for Multimodal Video Understanding | 27.20 | 2024-03-22 | opengvlab/internvideo, opengvlab/internvideo2
3 | VAST | VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset | 26.90 | 2023-05-29 | TXH-mercury/VALOR, txh-mercury/vast
4 | PaSST–RoBERTa & GPT-augment | Advancing Natural-Language Based Audio Retrieval with PaSST and Large Audio-Caption Data Sets | 26.07 | 2023-08-08 | optimusprimus/dcase2023_task6b
5 | ONE-PEACE | ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | 22.40 | 2023-05-18 | modelscope/modelscope, OFA-Sys/ONE-PEACE
6 | VALOR | VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset | 17.50 | 2023-04-17 | TXH-mercury/VALOR
7 | CE (pretraining: AudioCaps) | Audio Retrieval with Natural Language Queries | 0.00 | 2021-05-05 | oncescuandreea/audio-retrieval
8 | MoEE (pretraining: AudioCaps) | Audio Retrieval with Natural Language Queries | 0.00 | 2021-05-05 | oncescuandreea/audio-retrieval
9 | CE | Audio Retrieval with Natural Language Queries | 0.00 | 2021-05-05 | oncescuandreea/audio-retrieval
10 | MMT | Audio Retrieval with Natural Language Queries: A Benchmark Study | 0.00 | 2021-12-17 | akoepke/audio-retrieval-benchmark

All Papers (12)