ActivityNet Captions

Dense Video Captioning Benchmark

Performance Over Time

Showing 11 results | Metric: METEOR

Top Performing Models

| Rank | Model | Paper | METEOR | Date | Code |
|------|-------|-------|--------|------|------|
| 1 | CM² | Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval | 55.21 | 2024-04-11 | ailab-kyunghee/cm2_dvc |
| 2 | Vid2Seq | Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning | 28.00 | 2023-02-27 | google-research/scenic, antoyang/VidChapters, KastanDay/video-pretrained-transformer |
| 3 | Bi-directional+intra captioning | Team RUC_AIM3 Technical Report at Activitynet 2020 Task 2: Exploring Sequential Events Detection for Dense Video Captioning | 11.28 | 2020-06-14 | - |
| 4 | TSRM-CMG-HRNN+SCST | Dense-Captioning Events in Videos: SYSU Submission to ActivityNet Challenge 2020 | 9.71 | 2020-06-21 | ttengwang/dense-video-captioning-pytorch |
| 5 | GVL | Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos | 7.11 | 2023-03-11 | zjr2000/gvl |
| 6 | VTimeLLM | VTimeLLM: Empower LLM to Grasp Video Moments | 5.80 | 2023-11-30 | huangb23/vtimellm |
| 7 | PDVC (TSP features, no SCST) | End-to-End Dense Video Captioning with Parallel Decoding | 2.17 | 2021-08-17 | ttengwang/pdvc, aim3-ruc/youmakeup_challenge2022 |
| 8 | TSP | TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks | 2.02 | 2020-11-23 | HumamAlwassel/TSP |
| 9 | BMT | A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer | 1.88 | 2020-05-17 | v-iashin/video_features, v-iashin/BMT |
| 10 | iPerceive (Chadha et al., 2020) | iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering | 1.29 | 2020-11-16 | - |

All Papers (11)