EgoSchema

Name: EgoSchema
Published: 2023-08-17
License: Unknown

Dataset Information

Modalities: Videos
Introduced: 2023
License: Unknown
Homepage: Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

EgoSchema is a very long-form video question-answering dataset and benchmark for evaluating the long video understanding capabilities of modern vision and language systems. Derived from Ego4D, EgoSchema consists of over 5000 human-curated multiple-choice question-answer pairs spanning over 250 hours of real video data and covering a very broad range of natural human activity and behavior.

Variants: EgoSchema, EgoSchema (subset), EgoSchema (fullset)

Associated Benchmarks

This dataset is used in 1 benchmark:

Visual Question Answering (VQA) - Metrics: Acc
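
Because the questions are multiple choice, the Acc metric can be read as the fraction of questions for which a model selects the correct option. The following is a minimal sketch under that assumption, not the benchmark's official scoring script; the function name `multiple_choice_accuracy` and the sample option indices are hypothetical and serve only as illustration.

```python
from typing import Sequence


def multiple_choice_accuracy(predicted: Sequence[int], correct: Sequence[int]) -> float:
    """Return the fraction of questions whose predicted option matches the answer key."""
    if len(predicted) != len(correct):
        raise ValueError("predicted and correct must have the same length")
    if not correct:
        raise ValueError("cannot compute accuracy on an empty set of questions")
    # Count the questions where the chosen option index equals the keyed answer.
    hits = sum(p == c for p, c in zip(predicted, correct))
    return hits / len(correct)


# Hypothetical option indices for four questions (illustrative values only).
predictions = [2, 0, 4, 1]
answer_key = [2, 3, 4, 1]
accuracy = multiple_choice_accuracy(predictions, answer_key)
print(f"Acc: {accuracy:.2%}")  # prints "Acc: 75.00%"
```

Raising an error on mismatched or empty inputs, rather than returning 0, keeps an incomplete prediction file from being mistaken for a genuinely low score.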

Recent Benchmark Submissions

Task	Model	Paper	Date
Visual Question Answering (VQA)	Lyra-Pro	Lyra: An Efficient and Speech-Centric …	2024-12-12

Research Papers

Recent papers with results on this dataset:

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition (2024)

External Links:

EgoSchema
