AVA

Atomic Visual Actions

Dataset Information
Modalities: Videos
Introduced: 2018
License
Homepage

Overview

AVA is a project that provides audiovisual annotations of video for improving our understanding of human activity. Each of the video clips has been exhaustively annotated by human annotators, and together they represent a rich variety of scenes, recording conditions, and expressions of human activity. There are annotations for:

  • Kinetics (AVA-Kinetics) - a crossover between AVA and Kinetics. To provide localized action labels on a wider variety of visual scenes, the authors provide AVA action labels on videos from Kinetics-700, nearly doubling the total number of annotations and increasing the number of unique videos by over 500x.
  • Actions (AVA Actions) - the AVA dataset densely annotates 80 atomic visual actions in 430 15-minute movie clips. Actions are localized in space and time, resulting in 1.62M action labels, with multiple labels per person occurring frequently.
  • Spoken Activity (AVA ActiveSpeaker, AVA Speech) - AVA ActiveSpeaker associates speaking activity with a visible face in the AVA v1.0 videos, resulting in 3.65 million labeled frames across ~39K face tracks. AVA Speech densely annotates audio-based speech activity in the AVA v1.0 videos and explicitly labels 3 background noise conditions, resulting in ~46K labeled segments spanning 45 hours of data.
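
As a rough illustration of what "localized in space and time, with multiple labels per person" means in practice, the sketch below models a single action label and groups labels by person. The `ActionLabel` class, its field names, and the sample rows are hypothetical and chosen only for this example; they are not the official annotation format, which should be checked against the dataset release.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionLabel:
    """One hypothetical action label: who, where, when, and which action."""
    video_id: str      # clip identifier (made-up values below)
    timestamp_s: int   # time of the annotated frame, in seconds
    box: tuple         # (x1, y1, x2, y2), assumed normalized to [0, 1]
    person_id: int     # identifies one person within the clip
    action_id: int     # index into the 80 atomic action classes


def group_by_person(labels):
    """Collect the action ids attached to each (video, time, person)."""
    grouped = defaultdict(set)
    for lab in labels:
        key = (lab.video_id, lab.timestamp_s, lab.person_id)
        grouped[key].add(lab.action_id)
    return dict(grouped)


# Toy rows, invented for illustration only.
labels = [
    ActionLabel("clip_a", 902, (0.1, 0.2, 0.5, 0.9), 0, 12),
    ActionLabel("clip_a", 902, (0.1, 0.2, 0.5, 0.9), 0, 79),
    ActionLabel("clip_a", 902, (0.6, 0.1, 0.9, 0.8), 1, 12),
]
grouped = group_by_person(labels)

# Footage implied by the figures in the text: 430 clips of 15 minutes each.
total_hours = 430 * 15 / 60
```

In this toy data, person 0 carries two action labels at the same timestamp while person 1 carries one, which is the multi-label situation the description refers to.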

Variants: AVA v2.1, AVA-ActiveSpeaker, AVA-LAEO, AVA-Speech, AVA v2.2, AVA-Kinetics

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                 Model                        Paper                                             Date
Node Classification  ASDNet [ASDNet_ICCV2021]     Learning Long-Term Spatial-Temporal Graphs for …  2022-07-15
Node Classification  TalkNet [tao2021someone]     Learning Long-Term Spatial-Temporal Graphs for …  2022-07-15
Node Classification  UniCon [zhang2021unicon]     Learning Long-Term Spatial-Temporal Graphs for …  2022-07-15
Node Classification  MAAS-TAN [MAAS2021]          Learning Long-Term Spatial-Temporal Graphs for …  2022-07-15

Research Papers

Recent papers with results on this dataset: