The ActivityNet-QA dataset contains 58,000 human-annotated question-answer (QA) pairs on 5,800 videos, an average of 10 pairs per video, derived from the popular ActivityNet dataset. It provides a benchmark for evaluating video question answering (VideoQA) models on long-term spatio-temporal reasoning.
Source: ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering
Variants: ActivityNet-QA