| STAR/L | End-to-End Spatio-Temporal Action Localisation wi… | 90.30 | 2023-04-24 |
| SiA | Scaling Open-Vocabulary Action Detection | 88.50 | 2025-04-04 |
| YOWO + LFB | You Only Watch Once: A Unified CNN Architecture f… | 87.30 | 2019-11-15 |
| HIT | Holistic Interaction Transformer Network for Acti… | 84.80 | 2022-10-23 |
| YOWO | You Only Watch Once: A Unified CNN Architecture f… | 80.40 | 2019-11-15 |
| Two-in-one Two Stream | Dance with Flow: Two-in-One Stream Action Detecti… | 78.48 | 2019-04-01 |
| MOC | Actions as Moving Points | 77.80 | 2020-01-14 |
| Faster-RCNN + two-stream I3D conv | AVA: A Video Dataset of Spatio-temporally Localiz… | 76.30 | 2017-05-23 |
| Two-in-one | Dance with Flow: Two-in-One Stream Action Detecti… | 75.48 | 2019-04-01 |
| STEP | STEP: Spatio-Temporal Progressive Learning for Vi… | 75.00 | 2019-04-19 |
| Stable Mean Teacher (I3D) | Stable Mean Teacher for Semi-supervised Video Act… | 73.90 | 2024-12-10 |
| TACNet | TACNet: Transition-Aware Context Network for Spat… | 72.10 | 2019-05-31 |
| E2E-SSL (I3D) | End-to-End Semi-Supervised Learning for Video Act… | 69.90 | 2022-03-08 |
| DTS | Finding Action Tubes with a Sparse-to-Dense Frame… | 54.00 | 2020-08-30 |
| T-CNN | Tube Convolutional Neural Network (T-CNN) for Act… | 41.37 | 2017-03-30 |