| Model | Paper | Score | Date |
|---|---|---|---|
| CM² | Do You Remember? Dense Video Captioning with Cros… | 55.21 | 2024-04-11 |
| Vid2Seq | Vid2Seq: Large-Scale Pretraining of a Visual Lang… | 28.00 | 2023-02-27 |
| Bi-directional+intra captioning | Team RUC_AIM3 Technical Report at Activitynet 202… | 11.28 | 2020-06-14 |
| TSRM-CMG-HRNN+SCST | Dense-Captioning Events in Videos: SYSU Submissio… | 9.71 | 2020-06-21 |
| GVL | Learning Grounded Vision-Language Representation … | 7.11 | 2023-03-11 |
| VTimeLLM | VTimeLLM: Empower LLM to Grasp Video Moments | 5.80 | 2023-11-30 |
| PDVC (TSP features, no SCST) | End-to-End Dense Video Captioning with Parallel D… | 2.17 | 2021-08-17 |
| TSP | TSP: Temporally-Sensitive Pretraining of Video En… | 2.02 | 2020-11-23 |
| BMT | A Better Use of Audio-Visual Cues: Dense Video Ca… | 1.88 | 2020-05-17 |
| iPerceive (Chadha et al., 2020) | iPerceive: Applying Common-Sense Reasoning to Mul… | 1.29 | 2020-11-16 |
| MDVC | Multi-modal Dense Video Captioning | 1.07 | 2020-03-17 |