UCF101 Human Actions dataset
The UCF101 dataset extends UCF50 and consists of 13,320 video clips classified into 101 action categories. These categories fall into 5 types (body motion, human-human interactions, human-object interactions, playing musical instruments, and sports). The clips total more than 27 hours. All videos were collected from YouTube and have a fixed frame rate of 25 FPS and a resolution of 320 × 240.
Source: Two-stream Collaborative Learning with Spatial-Temporal Attention for Video Classification
Image Source: https://www.crcv.ucf.edu/data/UCF101.php
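As a rough sanity check on the figures in the description, the sketch below derives a few per-clip averages from them. It uses only the numbers quoted above. The variable names are illustrative, and because the total duration is stated only as "over 27 hours", the derived averages should be read as approximate lower bounds rather than official dataset statistics.

```python
# Figures quoted in the dataset description.
NUM_CLIPS = 13_320
NUM_CLASSES = 101
TOTAL_HOURS = 27          # stated as "over 27 hours": treated here as a lower bound
FPS = 25
WIDTH, HEIGHT = 320, 240

# Derived quantities (approximate, since TOTAL_HOURS is a lower bound).
total_seconds = TOTAL_HOURS * 3600
avg_clip_seconds = total_seconds / NUM_CLIPS
avg_frames_per_clip = avg_clip_seconds * FPS
avg_clips_per_class = NUM_CLIPS / NUM_CLASSES

# Size of one uncompressed RGB frame, assuming 3 bytes per pixel.
bytes_per_frame = WIDTH * HEIGHT * 3

print(f"average clip length : {avg_clip_seconds:.1f} s")
print(f"average frames/clip : {avg_frames_per_clip:.0f}")
print(f"average clips/class : {avg_clips_per_class:.0f}")
print(f"raw RGB frame size  : {bytes_per_frame} bytes")
```

On these assumptions a typical clip is only a few seconds long, which is one reason fixed-length samples such as the 16-frame variants are a common input format.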
Variants: UCF101-skeleton, UCF101-MiTv2, UCF-101 Zero-shot, 256x256, class-conditional, UCF101 (finetuned), UCF 101, UCF101, UCF-101 16 frames, Unconditional, Single GPU, UCF-101 16 frames, 64x64, Unconditional, UCF-101 16 frames, 128x128, Unconditional, UCF-101
This dataset is used in 8 benchmarks.