| SLAM-AAC | SLAM-AAC: Enhancing Audio Captioning with Paraphr… | 0.84 | 2024-10-12 |
| AutoCap | Taming Data and Transformers for Audio Generation | 0.83 | 2024-06-27 |
| EnCLAP++-large | EnCLAP++: Analyzing the EnCLAP Framework for Opti… | 0.82 | 2024-09-02 |
| LOAE | Enhancing Automated Audio Captioning via Large La… | 0.82 | 2024-06-19 |
| EnCLAP++-base | EnCLAP++: Analyzing the EnCLAP Framework for Opti… | 0.82 | 2024-09-02 |
| EnCLAP-large | EnCLAP: Combining Neural Audio Codec and Audio-Te… | 0.80 | 2024-01-31 |
| EnCLAP-base | EnCLAP: Combining Neural Audio Codec and Audio-Te… | 0.78 | 2024-01-31 |
| AL-MixGen | Exploring Train and Test-Time Augmentations for A… | 0.76 | 2022-10-31 |
| CNN+Transformer | Audio Captioning Transformer | 0.69 | 2021-07-21 |
| MQ-Cap | Enhancing Retrieval-Augmented Audio Captioning wi… | 0.30 | 2024-10-14 |
| LAVCap | LAVCap: LLM-based Audio-Visual Captioning using O… | 0.30 | 2025-01-16 |
| VAST | VAST: A Vision-Audio-Subtitle-Text Omni-Modality … | 0.30 | 2023-05-29 |
| VALOR | VALOR: Vision-Audio-Language Omni-Perception Pret… | 0.27 | 2023-04-17 |