| InternVideo2-6B | InternVideo2: Scaling Foundation Models for Multi… | 55.20 | 2024-03-22 |
| VAST | VAST: A Vision-Audio-Subtitle-Text Omni-Modality … | 52.00 | 2023-05-29 |
| ONE-PEACE | ONE-PEACE: Exploring One General Representation M… | 42.50 | 2023-05-18 |
| VALOR | VALOR: Vision-Audio-Language Omni-Perception Pret… | 40.10 | 2023-04-17 |
| AL-MixGen + Multi-TTA | Exploring Train and Test-Time Augmentations for A… | 34.70 | 2022-10-31 |
| QB-Norm+CE | Cross Modal Retrieval with Querybank Normalisation | 23.90 | 2021-12-23 |
| MMT | Audio Retrieval with Natural Language Queries: A … | | 2021-12-17 |
| CE | Audio Retrieval with Natural Language Queries: A … | | 2021-12-17 |
| CE | Audio Retrieval with Natural Language Queries | | 2021-05-05 |
| MoEE | Audio Retrieval with Natural Language Queries: A … | | 2021-12-17 |
| MoEE | Audio Retrieval with Natural Language Queries | | 2021-05-05 |