LSMDC

Large Scale Movie Description Challenge

Dataset Information
Modalities: Videos, Texts, Audio
Introduced: 2015

Overview

This dataset contains 118,081 short video clips extracted from 202 movies. Each clip has a caption taken either from the movie script or from transcribed DVS (descriptive video services), the audio description tracks produced for visually impaired audiences. The validation set contains 7,408 clips, and evaluation is performed on a test set of 1,000 clips drawn from movies that appear in neither the training nor the validation set.

Source: Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Image Source: https://sites.google.com/site/describingmovies/
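
The overview states that the test movies are disjoint from the training and validation movies. A minimal sketch of how that property could be checked is shown here. It assumes each clip ID has the form "<movie>_<start>-<end>", so the movie name is everything before the last underscore; that layout, the helper names `movie_of` and `leaked_movies`, and the placeholder IDs are illustrative assumptions, not part of the dataset documentation.

```python
def movie_of(clip_id: str) -> str:
    """Return the movie part of a clip ID.

    Assumed (hypothetical) layout: "<movie>_<start>-<end>".
    Everything before the last underscore is treated as the movie name.
    """
    return clip_id.rsplit("_", 1)[0]


def leaked_movies(test_clip_ids, train_val_clip_ids):
    """Return the set of movies that occur in both the test clips and
    the training/validation clips. An empty set means the split is
    disjoint at the movie level."""
    test_movies = {movie_of(c) for c in test_clip_ids}
    other_movies = {movie_of(c) for c in train_val_clip_ids}
    return test_movies & other_movies


# Placeholder IDs for illustration only; these are not real dataset entries.
train_val = [
    "0001_Movie_A_00.00.51.926-00.00.54.129",
    "0002_Movie_B_00.01.10.000-00.01.13.500",
]
test = ["0003_Movie_C_00.02.00.000-00.02.04.250"]

print(leaked_movies(test, train_val))  # empty set when no movie is shared
```

Checking at the movie level rather than the clip level matters because two different clips from the same movie share characters, sets, and style, which would make the test set easier than intended.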

Variants: LSMDC

Associated Benchmarks

This dataset is used in 3 benchmarks:

  • Zero-Shot Learning
  • Video Retrieval
  • Zero-Shot Video Retrieval
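
Text-to-video retrieval results are commonly reported as Recall@K (the percentage of captions whose correct clip is ranked in the top K) together with the median rank. This page does not state which metrics the benchmarks above use, so the following is only a sketch of that common setup. It assumes a square similarity matrix in which `similarity[i][j]` scores caption i against clip j and the correct clip for caption i is clip i; the function name `recall_at_k` and the toy scores are illustrative.

```python
from statistics import median


def recall_at_k(similarity, ks=(1, 5, 10)):
    """Compute text-to-video Recall@K (in percent) and median rank.

    similarity: square list of lists; row i holds the scores of
    caption i against every clip, and clip i is the correct match.
    """
    ranks = []
    for i, row in enumerate(similarity):
        gt_score = row[i]
        # 0-based rank: number of clips scoring strictly higher
        # than the ground-truth clip.
        ranks.append(sum(1 for score in row if score > gt_score))

    n = len(ranks)
    metrics = {f"R@{k}": 100.0 * sum(r < k for r in ranks) / n for k in ks}
    # Median rank is reported 1-based.
    metrics["MdR"] = median(ranks) + 1
    return metrics


# Toy 3x3 example: captions 0 and 2 rank their clip first,
# caption 1 ranks its clip last.
toy = [
    [0.9, 0.1, 0.3],
    [0.8, 0.2, 0.5],
    [0.1, 0.4, 0.7],
]
print(recall_at_k(toy))
```

With a 1,000-clip test set the matrix would be 1000 × 1000, and each caption is ranked against all 1,000 candidate clips.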

Recent Benchmark Submissions

Task | Model | Paper | Date
Video Retrieval | InternVideo2-6B | InternVideo2: Scaling Foundation Models for … | 2024-03-22
Zero-Shot Video Retrieval | InternVideo2-6B | InternVideo2: Scaling Foundation Models for … | 2024-03-22
Zero-Shot Video Retrieval | InternVideo2-1B | InternVideo2: Scaling Foundation Models for … | 2024-03-22
Video Retrieval | vid-TLDR (UMT-L) | vid-TLDR: Training Free Token merging … | 2024-03-20
Zero-Shot Video Retrieval | VAST, HowToCaption-finetuned | HowToCaption: Prompting LLMs to Transform … | 2023-10-07
Zero-Shot Video Retrieval | HowToCaption | HowToCaption: Prompting LLMs to Transform … | 2023-10-07
Zero-Shot Video Retrieval | BT-Adapter | BT-Adapter: Video Conversation is Feasible … | 2023-09-27
Video Retrieval | COSA | COSA: Concatenated Sample Pretrained Vision-Language … | 2023-06-15
Video Retrieval | VALOR | VALOR: Vision-Audio-Language Omni-Perception Pretraining Model … | 2023-04-17
Video Retrieval | UMT-L (ViT-L/16) | Unmasked Teacher: Towards Training-Efficient Video … | 2023-03-28
Zero-Shot Video Retrieval | UMT-L (ViT-L/16) | Unmasked Teacher: Towards Training-Efficient Video … | 2023-03-28
Video Retrieval | DiffusionRet | DiffusionRet: Generative Text-Video Retrieval with … | 2023-03-17
Zero-Shot Video Retrieval | mPLUG-2 | mPLUG-2: A Modularized Multi-modal Foundation … | 2023-02-01
Video Retrieval | mPLUG-2 | mPLUG-2: A Modularized Multi-modal Foundation … | 2023-02-01
Video Retrieval | STAN | Revisiting Temporal Modeling for CLIP-based … | 2023-01-26
Video Retrieval | HiTeA | HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training | 2022-12-30
Zero-Shot Video Retrieval | HiTeA-5M | HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training | 2022-12-30
Zero-Shot Video Retrieval | HiTeA-17M | HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training | 2022-12-30
Video Retrieval | InternVideo | InternVideo: General Video Foundation Models … | 2022-12-06
Zero-Shot Video Retrieval | InternVideo | InternVideo: General Video Foundation Models … | 2022-12-06

Research Papers

Recent papers with results on this dataset: