COIN

Dataset Information
Modalities: Videos

Overview

The COIN dataset (a large-scale dataset for COmprehensive INstructional video analysis) consists of 11,827 videos covering 180 tasks in 12 domains of daily life (e.g., vehicles and gadgets). All videos were collected from YouTube. The average video length is 2.36 minutes. Each video is labelled with 3.91 step segments on average, and each segment lasts 14.91 seconds on average. In total, the dataset contains 476 hours of video with 46,354 annotated segments.
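As a rough sanity check on the figures above, the following sketch recomputes a few derived quantities from the reported totals. The variable names are illustrative only. The small gaps between the recomputed and reported values are presumably due to rounding in the published averages, which is an assumption and not something the source states.

```python
# Figures as reported in the overview (taken as given, not re-measured).
NUM_VIDEOS = 11_827
NUM_SEGMENTS = 46_354
MEAN_VIDEO_MINUTES = 2.36
MEAN_SEGMENT_SECONDS = 14.91
REPORTED_TOTAL_HOURS = 476

# Average number of annotated step segments per video.
segments_per_video = NUM_SEGMENTS / NUM_VIDEOS

# Total duration implied by the mean video length.
hours_from_mean_length = NUM_VIDEOS * MEAN_VIDEO_MINUTES / 60

# Total duration covered by annotated segments.
annotated_hours = NUM_SEGMENTS * MEAN_SEGMENT_SECONDS / 3600

# Share of the reported footage that lies inside annotated segments.
annotated_fraction = annotated_hours / REPORTED_TOTAL_HOURS

print(f"segments per video:      {segments_per_video:.2f}")
print(f"hours from mean length:  {hours_from_mean_length:.1f}")
print(f"annotated hours:         {annotated_hours:.1f}")
print(f"annotated fraction:      {annotated_fraction:.2f}")
```

Under these figures, the mean video length implies about 465 hours, close to the reported 476, and roughly 192 hours (about 40% of the reported total) fall inside annotated step segments.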

Source: COIN: A Large-scale Dataset for Comprehensive Instructional Video Analysis

Variants: COIN

Associated Benchmarks

This dataset is used in 2 benchmarks. The submissions listed below cover two tasks: Action Segmentation and Video Classification.

Recent Benchmark Submissions

| Task | Model | Paper | Date |
| --- | --- | --- | --- |
| Video Classification | HERMES | HERMES: temporal-coHERent long-forM understanding with … | 2024-08-30 |
| Video Classification | MA-LMM | MA-LMM: Memory-Augmented Large Multimodal Model … | 2024-04-08 |
| Action Segmentation | Norton | Multi-granularity Correspondence Learning from Long-term … | 2024-01-30 |
| Action Segmentation | UnLoc-L | UnLoc: A Unified Framework for … | 2023-08-21 |
| Video Classification | S5 | Selective Structured State-Spaces for Long-Form … | 2023-03-25 |
| Video Classification | TranS4mer | Efficient Movie Scene Detection using … | 2022-12-29 |
| Video Classification | ViS4mer | Long Movie Clip Classification with … | 2022-04-04 |
| Video Classification | D-Sprv. | Learning To Recognize Procedural Activities … | 2022-01-26 |
| Action Segmentation | VideoCLIP | VideoCLIP: Contrastive Pre-training for Zero-shot … | 2021-09-28 |
| Action Segmentation | TACo | TACo: Token-aware Cascade Contrastive Learning … | 2021-08-23 |
| Action Segmentation | VLM | VLM: Task-agnostic Video-Language Model Pre-training … | 2021-05-20 |
| Action Segmentation | ActBERT | ActBERT: Learning Global-Local Video-Text Representations | 2020-11-14 |
| Action Segmentation | UniVL | UniVL: A Unified Video and … | 2020-02-15 |
| Action Segmentation | CBT | End-to-End Learning of Visual Representations … | 2019-12-13 |
| Action Segmentation | MIL-NCE | End-to-End Learning of Visual Representations … | 2019-12-13 |
| Video Classification | TSN | Temporal Segment Networks for Action … | 2017-05-08 |

Research Papers

Recent papers with results on this dataset: