| VAST | VAST: A Vision-Audio-Subtitle-Text Omni-Modality … | 19.00 | 2023-05-29 |
| MQ-Cap | Enhancing Retrieval-Augmented Audio Captioning wi… | 18.10 | 2024-10-14 |
| Audio Flamingo (Pengi trainset) | Audio Flamingo: A Novel Audio Language Model with… | 17.40 | 2024-02-02 |
| VALOR | VALOR: Vision-Audio-Language Omni-Perception Pret… | 16.20 | 2023-04-17 |
| SLAM-AAC | SLAM-AAC: Enhancing Audio Captioning with Paraphr… | 0.52 | 2024-10-12 |
| LOAE | Enhancing Automated Audio Captioning via Large La… | 0.51 | 2024-06-19 |
| Qwen-Audio | Qwen-Audio: Advancing Universal Audio Understandi… | 0.44 | 2023-11-14 |
| Ensemble | The NTT DCASE2020 Challenge Task 6 system: Automa… | 0.32 | 2020-07-01 |
| RNN-GRU-EncDec + VGGish + Word2Vec | Audio Captioning using Gated Recurrent Units | 0.18 | 2020-06-05 |