Kinetics

Kinetics Human Action Video Dataset

Dataset Information
Modalities: Videos
Introduced: 2017

Overview

The Kinetics dataset is a large-scale, high-quality dataset for human action recognition in videos. The dataset consists of around 500,000 video clips covering 600 human action classes with at least 600 video clips for each action class. Each video clip lasts around 10 seconds and is labeled with a single action class. The videos are collected from YouTube.

Source: Self-supervised Visual Feature Learning with Deep Neural Networks: A Survey
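
The structure described in the overview (one action label per clip, clips of roughly 10 seconds, a minimum number of clips per class) can be checked against an annotation list with a short script. The following is a minimal sketch, not an official loader. The record fields (`label`, `youtube_id`, `time_start`, `time_end`), the sample rows, and the `summarize` helper are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical annotation records. The field names and values are
# assumptions for this sketch, not taken from the dataset itself.
SAMPLE_ROWS = [
    {"label": "abseiling", "youtube_id": "vid_a", "time_start": 0, "time_end": 10},
    {"label": "abseiling", "youtube_id": "vid_b", "time_start": 5, "time_end": 15},
    {"label": "zumba", "youtube_id": "vid_c", "time_start": 20, "time_end": 30},
]


def summarize(rows, min_clips_per_class):
    """Summarize clip counts, class counts, and clip durations."""
    # Each clip carries exactly one label, so counting labels counts clips per class.
    counts = Counter(row["label"] for row in rows)
    # Clip length in seconds, derived from the start and end timestamps.
    durations = [row["time_end"] - row["time_start"] for row in rows]
    below = sorted(label for label, n in counts.items() if n < min_clips_per_class)
    return {
        "num_clips": len(rows),
        "num_classes": len(counts),
        "mean_duration_s": sum(durations) / len(durations) if durations else 0.0,
        "classes_below_minimum": below,
    }


summary = summarize(SAMPLE_ROWS, min_clips_per_class=2)
print(summary)
```

For the full dataset as described above, the same check would use `min_clips_per_class=600`, and the mean duration should come out near 10 seconds.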

Variants: Kinetics-GEBD, MiniKinetics, Kinetics-ZSAR, Kinetics-Sounds, Kinetics-Sound, Kinetics-GEB+, Kinetics-700, Kinetics-600 (12 frames, 64x64), Kinetics-600 (12 frames, 128x128), Kinetics-600, Kinetics 400, Kinetics-100, Imbalanced-MiniKinetics200, Kinetics-Skeleton dataset, AVA-Kinetics, Kinetics-600 (48 frames, 64x64), Kinetics-400, Kinetics

Associated Benchmarks

This dataset is used in 4 benchmarks.

Recent Benchmark Submissions

Task | Model | Paper | Date
Zero-Shot Action Recognition | TC-CLIP | Leveraging Temporal Contextualization for Video … | 2024-04-15
Zero-Shot Action Recognition | OST | OST: Refining Text Knowledge with … | 2023-11-30
Zero-Shot Action Recognition | LanguageBind | LanguageBind: Extending Video-Language Pretraining to … | 2023-10-03
Zero-Shot Action Recognition | OTI(ViT-L/14) | Orthogonal Temporal Interpolation for Zero-Shot … | 2023-08-14
Visual Tracking | TAPIR (MOVi-E) | TAPIR: Tracking Any Point with … | 2023-06-14
Visual Tracking | TAPIR (Panning MOVi-E) | TAPIR: Tracking Any Point with … | 2023-06-14
Zero-Shot Action Recognition | IMP-MoE-L | Alternating Gradient Descent and Mixture-of-Experts … | 2023-05-10
Zero-Shot Action Recognition | MAXI | MAtch, eXpand and Improve: Unsupervised … | 2023-03-15
Zero-Shot Action Recognition | BIKE | Bidirectional Cross-Modal Knowledge Exploration for … | 2022-12-31
Zero-Shot Action Recognition | VideoCoCa | VideoCoCa: Video-Text Modeling with Zero-Shot … | 2022-12-09
Zero-Shot Action Recognition | X-CLIP | Expanding Language-Image Pretrained Models for … | 2022-08-04
Zero-Shot Action Recognition | Text4Vis | Revisiting Classifier: Transferring Vision-Language Models … | 2022-07-04
Text-to-Video Generation | NUWA (128×128) | NÜWA: Visual Synthesis Pre-training for … | 2021-11-24
Zero-Shot Action Recognition | ER-ZSAR (ST+Obj) | Elaborative Rehearsal for Zero-shot Action … | 2021-08-05
Zero-Shot Action Recognition | ER-ZSAR (ST) | Elaborative Rehearsal for Zero-shot Action … | 2021-08-05
Zero-Shot Action Recognition | GCN | All About Knowledge Graphs for … | 2020-08-28
Video Classification | Multigrid | A Multigrid Method for Efficiently … | 2019-12-02
Zero-Shot Action Recognition | DEM | Learning a Deep Embedding Model … | 2016-11-15
Zero-Shot Action Recognition | ALE | Label-Embedding for Image Classification | 2015-03-30
Zero-Shot Action Recognition | SJE(Word Embedding) | Evaluation of Output Embeddings for … | 2014-09-30

Research Papers

Recent papers with results on this dataset: