QVHighlights

Highlight Detection Benchmark

📊 Showing 20 results | 📏 Metric: mAP
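
For readers unfamiliar with the metric, the following is a minimal sketch of how a mean average precision score can be computed from ranked predictions. It is a generic illustration, not the benchmark's official evaluation script: the function names (`average_precision`, `mean_average_precision`), the binary per-clip labels, and the per-query averaging are all assumptions made here, and the official protocol may define positives and aggregate scores differently.

```python
from typing import List, Sequence, Tuple


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision for one ranked list.

    scores: predicted saliency per clip (higher means more highlight-like).
    labels: 1 if the clip is a ground-truth highlight, else 0 (assumed binary).
    """
    # Rank clip indices by descending predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, idx in enumerate(order, start=1):
        if labels[idx]:
            hits += 1
            # Precision at each rank where a positive clip is retrieved.
            precisions.append(hits / rank)
    # With no positives, AP is reported as 0.0 here (a convention, not a rule).
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    per_query: Sequence[Tuple[Sequence[float], Sequence[int]]]
) -> float:
    """Mean of per-query AP values (hypothetical aggregation)."""
    aps = [average_precision(s, l) for s, l in per_query]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Two toy queries with four clips each.
    queries = [
        ([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]),
        ([0.2, 0.9, 0.4, 0.1], [0, 1, 0, 0]),
    ]
    print(f"mAP = {100 * mean_average_precision(queries):.2f}")
```

The leaderboard values appear to be on a 0 to 100 scale, which is why the usage example multiplies by 100.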

Top Performing Models

Rank	Model	Paper	mAP	Date	Code
1	QD-DETR (only Video w/ PT)	Query-Dependent Video Representation for Moment Retrieval and Highlight Detection	61.91	2023-03-24	📦 wjun0830/qd-detr
2	SG-DETR (w/ PT) 📚	Saliency-Guided DETR for Moment Retrieval and Highlight Detection	44.70	2024-10-02	📦 ai-forever/sg-detr
3	FlashVTG	FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding	44.09	2024-12-18	📦 zhuo-cao/flashvtg
4	SG-DETR	Saliency-Guided DETR for Moment Retrieval and Highlight Detection	43.76	2024-10-02	📦 ai-forever/sg-detr
5	VideoLights-B-pt 📚	VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval	42.84	2024-12-02	📦 dpaul06/VideoLights
6	HL-CLIP	Unleash the Potential of CLIP for Video Highlight Detection	41.94	2024-04-02	📦 dhk1349/HL-CLIP
7	R^2-Tuning	$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding	40.75	2024-03-31	📦 yeliudev/R2-Tuning
8	CG-DETR (w/ PT) 📚	Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding	40.71	2023-11-15	📦 wjun0830/qd-detr 📦 wjun0830/cgdetr
9	NumPro	Number it: Temporal Grounding Videos like Flipping Manga	40.54	2024-11-15	📦 yongliang-wu/numpro
10	UniVTG (w/ PT) 📚	UniVTG: Towards Unified Video-Language Temporal Grounding	40.54	2023-07-31	📦 showlab/univtg

All Papers (20)

Model	Paper	Year	Code
QD-DETR (only Video w/ PT)	Query-Dependent Video Representation for Moment Retrieval and Highlight Detection	2023	📦 wjun0830/qd-detr
SG-DETR (w/ PT)	Saliency-Guided DETR for Moment Retrieval and Highlight Detection	2024	📦 ai-forever/sg-detr
FlashVTG	FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding	2024	📦 zhuo-cao/flashvtg
SG-DETR	Saliency-Guided DETR for Moment Retrieval and Highlight Detection	2024	📦 ai-forever/sg-detr
VideoLights-B-pt	VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval	2024	📦 dpaul06/VideoLights
HL-CLIP	Unleash the Potential of CLIP for Video Highlight Detection	2024	📦 dhk1349/HL-CLIP
R^2-Tuning	$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding	2024	📦 yeliudev/R2-Tuning
CG-DETR (w/ PT)	Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding	2023	📦 wjun0830/qd-detr 📦 wjun0830/cgdetr
NumPro	Number it: Temporal Grounding Videos like Flipping Manga	2024	📦 yongliang-wu/numpro
UniVTG (w/ PT)	UniVTG: Towards Unified Video-Language Temporal Grounding	2023	📦 showlab/univtg
CG-DETR	Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding	2023	📦 wjun0830/qd-detr 📦 wjun0830/cgdetr
LLMEPET	Prior Knowledge Integration via LLM Encoding and Pseudo Event Regulation for Video Moment Retrieval	2024	📦 fletcherjiang/llmepet
UMT (w. PT)	UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection	2022	📦 tencentarc/umt 📦 MindCode-4/code-5 📦 MS-P3/code7
QD-DETR	Query-Dependent Video Representation for Moment Retrieval and Highlight Detection	2023	📦 wjun0830/qd-detr
QD-DETR (only Video)	Query-Dependent Video Representation for Moment Retrieval and Highlight Detection	2023	📦 wjun0830/qd-detr
QD-DETR (w/ PT)	Query-Dependent Video Representation for Moment Retrieval and Highlight Detection	2023	📦 wjun0830/qd-detr
UniVTG	UniVTG: Towards Unified Video-Language Temporal Grounding	2023	📦 showlab/univtg
UMT	UMT: Unified Multi-modal Transformers for Joint Video Moment Retrieval and Highlight Detection	2022	📦 tencentarc/umt 📦 MindCode-4/code-5 📦 MS-P3/code7
Moment-DETR w/ PT	QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries	2021	📦 jayleicn/moment_detr 📦 tencentarc/umt
VideoChat-T (FT)	TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning	2024	📦 OpenGVLab/TimeSuite

Model	Paper	mAP	Date
QD-DETR (only Video w/ PT)	Query-Dependent Video Representation for Moment R…	61.91	2023-03-24
SG-DETR (w/ PT)	Saliency-Guided DETR for Moment Retrieval and Hig…	44.70	2024-10-02
FlashVTG	FlashVTG: Feature Layering and Adaptive Score Han…	44.09	2024-12-18
SG-DETR	Saliency-Guided DETR for Moment Retrieval and Hig…	43.76	2024-10-02
VideoLights-B-pt	VideoLights: Feature Refinement and Cross-Task Al…	42.84	2024-12-02
HL-CLIP	Unleash the Potential of CLIP for Video Highlight…	41.94	2024-04-02
R^2-Tuning	$R^2$-Tuning: Efficient Image-to-Video Transfer L…	40.75	2024-03-31
CG-DETR (w/ PT)	Correlation-Guided Query-Dependency Calibration f…	40.71	2023-11-15
NumPro	Number it: Temporal Grounding Videos like Flippin…	40.54	2024-11-15
UniVTG (w/ PT)	UniVTG: Towards Unified Video-Language Temporal G…	40.54	2023-07-31
CG-DETR	Correlation-Guided Query-Dependency Calibration f…	40.33	2023-11-15
LLMEPET	Prior Knowledge Integration via LLM Encoding and …	40.33	2024-07-21
UMT (w. PT)	UMT: Unified Multi-modal Transformers for Joint V…	39.12	2022-03-23
QD-DETR	Query-Dependent Video Representation for Moment R…	39.04	2023-03-24
QD-DETR (only Video)	Query-Dependent Video Representation for Moment R…	38.94	2023-03-24
QD-DETR (w/ PT)	Query-Dependent Video Representation for Moment R…	38.52	2023-03-24
UniVTG	UniVTG: Towards Unified Video-Language Temporal G…	38.20	2023-07-31
UMT	UMT: Unified Multi-modal Transformers for Joint V…	38.18	2022-03-23
Moment-DETR w/ PT	QVHighlights: Detecting Moments and Highlights in…	37.43	2021-07-20
VideoChat-T (FT)	TimeSuite: Improving MLLMs for Long Video Underst…	27.00	2024-10-25