VLEP

Name: VLEP
Published: 2020-10-15
License: Unknown

Video-and-Language Event Prediction

Dataset Information

Modalities

Videos, Texts

Introduced

2020

License

Unknown

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

VLEP contains 28,726 future event prediction examples, each with rationales, drawn from 10,234 TV show and YouTube lifestyle vlog video clips. Each example (see Figure 1) consists of a Premise Event (a short video clip with dialogue), a Premise Summary (a text summary of the premise event), and two human-written candidate Future Events in natural language, each accompanied by a Rationale. The clips average 6.1 seconds in length and were chosen because TV shows and lifestyle vlogs are diverse, event-rich sources.
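
The structure of one example described above can be sketched as a small record type. This is only an illustration: the class name, the field names, the assumption that each example carries a 0/1 label for the more likely future event, and the sample values are all hypothetical and are not taken from the dataset's actual file format.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VlepExample:
    """One future event prediction example (all field names are hypothetical)."""

    premise_clip: str               # identifier or path of the short video clip
    premise_dialogue: str           # dialogue that accompanies the clip
    premise_summary: str            # text summary of the premise event
    future_events: Tuple[str, str]  # the two candidate future events
    rationales: Tuple[str, str]     # one rationale per candidate event
    more_likely: int                # assumed label: index (0 or 1) of the more likely event

    def __post_init__(self) -> None:
        # Guard against labels that do not point at one of the two candidates.
        if self.more_likely not in (0, 1):
            raise ValueError("more_likely must be 0 or 1")


# Invented placeholder values, for illustration only (not a real VLEP entry).
example = VlepExample(
    premise_clip="clip_0001",
    premise_dialogue="A: Watch out, the pan is hot!",
    premise_summary="A person reaches for a pan on the stove.",
    future_events=("The person pulls their hand back.", "The person goes to sleep."),
    rationales=("They were warned that the pan is hot.", "It is late in the evening."),
    more_likely=0,
)

chosen_event = example.future_events[example.more_likely]
```

Keeping the two candidates and their rationales in parallel tuples makes it easy to index both with the same label.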

Source: What is More Likely to Happen Next? Video-and-Language Future Event Prediction

Variants: VLEP

Associated Benchmarks

This dataset is used in 1 benchmark:

Video Question Answering - Metrics: Accuracy
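
As a rough sketch of how the accuracy metric could be computed, assume the task is framed as picking the more likely of the two candidate future events, so that each prediction and each label is an index 0 or 1. The function name `choice_accuracy` and this framing are assumptions for illustration, not the benchmark's official evaluation code.

```python
from typing import Sequence


def choice_accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of examples where the predicted index equals the labeled index."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one example is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy inputs: three of the four predictions match their labels.
score = choice_accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```

With only two candidates per example, a model that guesses uniformly at random would be expected to score about 0.5 under this framing.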

Recent Benchmark Submissions

Task	Model	Paper	Date
Video Question Answering	LLaMA-VQA	Large Language Models are Temporal …	2023-10-24

Research Papers

Recent papers with results on this dataset:

Large Language Models are Temporal and Causal Reasoners for Video Question Answering (2023)
