MSRVTT-MC

Dataset Information
Introduced: 2018
License: Unknown

Overview

The MSRVTT-MC (Multiple Choice) dataset is a video question-answering dataset built on MSR-VTT. It contains 2,990 questions generated from 10,000 video clips and their ground-truth captions. Each question offers five candidate captions: the ground-truth caption and four randomly sampled negatives. The task is to select the ground-truth caption from the five candidates.
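
The multiple-choice setup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dataset's official construction or evaluation code: the names `build_question`, `multiple_choice_accuracy`, and `score_fn` are hypothetical, and `score_fn` stands in for whatever video-text similarity a model provides.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical question record: (video_id, candidate captions, index of the correct one).
Question = Tuple[str, List[str], int]


def build_question(
    gt_caption: str,
    caption_pool: Sequence[str],
    rng: random.Random,
    num_negatives: int = 4,
) -> Tuple[List[str], int]:
    """Combine the ground-truth caption with randomly sampled negatives.

    Assumption: negatives are drawn uniformly from captions other than the
    ground truth; the real sampling procedure may differ.
    """
    negatives = rng.sample([c for c in caption_pool if c != gt_caption], num_negatives)
    candidates = negatives + [gt_caption]
    rng.shuffle(candidates)
    return candidates, candidates.index(gt_caption)


def multiple_choice_accuracy(
    questions: Sequence[Question],
    score_fn: Callable[[str, str], float],
) -> float:
    """Fraction of questions where the top-scoring candidate is the ground truth."""
    correct = 0
    for video_id, candidates, answer_index in questions:
        scores = [score_fn(video_id, caption) for caption in candidates]
        # Pick the candidate with the highest score (first one wins on ties).
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == answer_index)
    return correct / len(questions)
```

With five candidates per question, a scorer that guesses uniformly at random would be expected to reach about 20% accuracy, which is a useful baseline when reading reported results.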

Variants: MSRVTT-MC, MSR-VTT-MC

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Video Question Answering | Norton | Multi-granularity Correspondence Learning from Long-term … | 2024-01-30
Video Question Answering | HiTeA | HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training | 2022-12-30
Video Question Answering | VindLU | VindLU: A Recipe for Effective … | 2022-12-09
Video Question Answering | VIOLETv2 | An Empirical Study of End-to-End … | 2022-09-04
Video Question Answering | Clover | Clover: Towards A Unified Video-Language … | 2022-07-16
Video Question Answering | Singularity-temporal | Revealing Single Frame Bias for … | 2022-06-07
Video Question Answering | Singularity | Revealing Single Frame Bias for … | 2022-06-07

Research Papers

Recent papers with results on this dataset: