| Norton | Multi-granularity Correspondence Learning from Lo… | 64.10 | 2024-01-30 |
| VideoCLIP | VideoCLIP: Contrastive Pre-training for Zero-shot… | 63.10 | 2021-09-28 |
| TACo | TACo: Token-aware Cascade Contrastive Learning fo… | 55.70 | 2021-08-23 |
| VAST, HowToCaption-finetuned | HowToCaption: Prompting LLMs to Transform Video A… | 53.90 | 2023-10-07 |
| VideoCoCa | VideoCoCa: Video-Text Modeling with Zero-Shot Tra… | 53.30 | 2022-12-09 |
| MIL-NCE | End-to-End Learning of Visual Representations fro… | 51.20 | 2019-12-13 |
| VATT-MBS | VATT: Transformers for Multimodal Self-Supervised… | 45.50 | 2021-04-22 |
| HowToCaption | HowToCaption: Prompting LLMs to Transform Video A… | 44.10 | 2023-10-07 |