MSRVTT-CTN

MSRVTT Causal-Temporal Narrative

Dataset Information
Modalities
Videos, Texts
Languages
English
Introduced
2024
License
Creative Commons Attribution Non Commercial No Derivatives 4.0

Overview

MSRVTT-CTN Dataset

This dataset provides the Causal-Temporal Narrative (CTN) annotations of the MSRVTT-CTN benchmark in JSON format. The annotations are split across three files, one each for the train, test, and validation splits.
For project details, visit https://narrativebridge.github.io/.

Dataset Structure

Each JSON file contains a dictionary where the keys are the video IDs and the values are the corresponding Causal-Temporal Narrative (CTN) captions. The CTN captions are represented as a dictionary with two keys: "Cause" and "Effect", containing the cause and effect statements, respectively.

Example:

{
  "video_id_1": {
    "Cause": "a person performed an action",
    "Effect": "a specific outcome occurred"
  },
  "video_id_2": {
    "Cause": "another cause statement",
    "Effect": "another effect statement"
  }
}
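The layout described above can be checked before training or evaluation. The following is a minimal sketch, not part of the dataset release: the helper name find_malformed_entries is hypothetical, and it assumes that both "Cause" and "Effect" hold plain strings, as the example suggests.

```python
REQUIRED_KEYS = ("Cause", "Effect")


def find_malformed_entries(ctn_data):
    """Return {video_id: reason} for entries that do not match the documented layout.

    Hypothetical helper; assumes Cause and Effect are plain strings.
    """
    problems = {}
    for video_id, caption in ctn_data.items():
        if not isinstance(caption, dict):
            problems[video_id] = "caption is not a dictionary"
            continue
        missing = [key for key in REQUIRED_KEYS if key not in caption]
        if missing:
            problems[video_id] = "missing keys: " + ", ".join(missing)
        elif not all(isinstance(caption[key], str) for key in REQUIRED_KEYS):
            problems[video_id] = "Cause/Effect must be strings"
    return problems


# Small in-memory sample; the second entry deliberately omits "Effect".
sample = {
    "video_id_1": {
        "Cause": "a person performed an action",
        "Effect": "a specific outcome occurred",
    },
    "video_id_2": {"Cause": "another cause statement"},
}
print(find_malformed_entries(sample))
```

An empty result means every entry has both keys. Entries with extra keys are accepted, since the card does not say whether additional fields may appear.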

Loading the Datasets

To load the datasets, use a JSON parsing library in your preferred programming language. For example, in Python, you can use the json module:

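import json

# Open with an explicit encoding so the result does not depend on the platform default
with open("MSRVTT_CTN_train.json", "r", encoding="utf-8") as f:
    msrvtt_train_data = json.load(f)

# Access the CTN captions: each value is a dict with "Cause" and "Effect" keys
for video_id, ctn_caption in msrvtt_train_data.items():
    cause = ctn_caption["Cause"]
    effect = ctn_caption["Effect"]
    # Process the cause and effect statements as needed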
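For downstream use it is often convenient to flatten the per-video dictionaries into a list of records tagged with their split. The sketch below is one possible way to do that. The helper names load_ctn and to_records are hypothetical, and only the train file name comes from this card: the test and validation file names are not stated here, so add them to SPLIT_FILES once you know them.

```python
import json
from pathlib import Path


def load_ctn(path):
    """Load one CTN annotation file into a {video_id: caption} dictionary."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def to_records(ctn_data, split):
    """Flatten {video_id: {"Cause", "Effect"}} into a list of dicts, sorted by video ID."""
    return [
        {
            "video_id": video_id,
            "split": split,
            "cause": caption["Cause"],
            "effect": caption["Effect"],
        }
        for video_id, caption in sorted(ctn_data.items())
    ]


# Only the train file name appears in this card; the other two are left
# for the reader to fill in rather than guessed.
SPLIT_FILES = {"train": "MSRVTT_CTN_train.json"}

records = []
for split, filename in SPLIT_FILES.items():
    if Path(filename).exists():  # skip splits that have not been downloaded
        records.extend(to_records(load_ctn(filename), split))

print(f"Loaded {len(records)} CTN captions")
```

Sorting by video ID keeps the record order stable across runs, which helps when comparing outputs between experiments.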

License

The MSRVTT-CTN benchmark dataset is licensed under the Creative Commons Attribution Non Commercial No Derivatives 4.0 International (CC BY-NC-ND 4.0) license.

Variants: MSRVTT-CTN

Associated Benchmarks

This dataset is used in one benchmark.

Recent Benchmark Submissions

Task              Model    Paper                                               Date
Video Captioning  CEN      NarrativeBridge: Enhancing Video Captioning with …  2024-06-10
Video Captioning  GIT      GiT: Towards Generalist Vision Transformer …        2024-03-14
Video Captioning  SEM-POS  SEM-POS: Grammatically and Semantically Correct …   2023-03-26
