TvSum

TVSum: Summarizing Web Videos Using Titles

Dataset Information
Introduced: 2015

Overview

Introduced by Song et al. in "TVSum: Summarizing Web Videos Using Titles".

The TVSum dataset comprises 50 videos, with durations ranging from 1 to 11 minutes. These videos belong to 10 different categories associated with the TRECVid MED task, with 5 videos in each category, and were collected from YouTube. The video categories include various activities like changing a vehicle tire, making a sandwich, and flash mob gatherings. For annotation, each video was reviewed and rated by 20 users, who assigned frame-level importance scores on a scale from 1 (not important) to 5 (very important).
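The annotation scheme above gives each frame several ratings, one per annotator, so a common first step is to collapse them into a single importance curve. The following is a minimal sketch of that step, assuming the ratings are already available as one list of integer scores per annotator. The function names `aggregate_importance` and `rescale_unit`, the toy ratings, and the choice of a plain mean are illustrative assumptions, not the dataset's file format or the authors' procedure.

```python
MIN_SCORE, MAX_SCORE = 1, 5  # rating scale stated in the dataset description


def aggregate_importance(ratings):
    """Average per-frame scores across annotators.

    ratings: list of per-annotator lists, one integer score per frame.
    Returns a list with one mean score per frame.
    """
    if not ratings:
        raise ValueError("ratings must contain at least one annotator")
    n_frames = len(ratings[0])
    for row in ratings:
        if len(row) != n_frames:
            raise ValueError("every annotator must rate the same number of frames")
        for score in row:
            if not MIN_SCORE <= score <= MAX_SCORE:
                raise ValueError(f"score {score} is outside {MIN_SCORE}..{MAX_SCORE}")
    # zip(*ratings) regroups the data so each tuple holds one frame's scores.
    return [sum(frame) / len(frame) for frame in zip(*ratings)]


def rescale_unit(scores):
    """Map scores from the 1..5 scale onto 0..1."""
    span = MAX_SCORE - MIN_SCORE
    return [(s - MIN_SCORE) / span for s in scores]


# Hypothetical toy input: 3 annotators, 4 frames (real videos have 20 annotators).
toy_ratings = [
    [1, 3, 5, 2],
    [2, 3, 4, 2],
    [3, 3, 3, 2],
]
frame_scores = aggregate_importance(toy_ratings)
unit_scores = rescale_unit(frame_scores)
print(frame_scores)  # [2.0, 3.0, 4.0, 2.0]
print(unit_scores)   # [0.25, 0.5, 0.75, 0.25]
```

Averaging is only one possible choice. A median or a per-annotator normalization may be preferable when annotators use the scale differently.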

Variants: TvSum

Associated Benchmarks

This dataset is used in 2 benchmarks: Highlight Detection and Video Summarization.

Recent Benchmark Submissions

| Task | Model | Paper | Date |
| --- | --- | --- | --- |
| Highlight Detection | FlashVTG | FlashVTG: Feature Layering and Adaptive … | 2024-12-18 |
| Highlight Detection | SG-DETR | Saliency-Guided DETR for Moment Retrieval … | 2024-10-02 |
| Video Summarization | CSTA | CSTA: CNN-based Spatiotemporal Attention for … | 2024-05-20 |
| Highlight Detection | UVCOM (train from scratch) | Bridging the Gap: A Unified … | 2023-11-28 |
| Highlight Detection | CG-DETR | Correlation-Guided Query-Dependency Calibration for Video … | 2023-11-15 |
| Highlight Detection | QD-DETR (only Video) | Query-Dependent Video Representation for Moment … | 2023-03-24 |
| Highlight Detection | QD-DETR | Query-Dependent Video Representation for Moment … | 2023-03-24 |
| Highlight Detection | UMT | UMT: Unified Multi-modal Transformers for … | 2022-03-23 |
| Video Summarization | VASNet | Summarizing Videos with Attention | 2018-12-05 |
| Video Summarization | M-AVS | Video Summarization with Attention-Based Encoder-Decoder … | 2017-08-31 |

Research Papers

Recent papers with results on this dataset: