VidHOI is a video-based benchmark for human-object interaction (HOI) detection. It is built on VidOR, which is densely annotated with every human and predefined object appearing in each frame. VidOR is also more challenging than image-based HOI datasets because its videos are unstaged and user-generated, and therefore jittery at times.
Image source: https://xdshang.github.io/docs/vidor.html
Variants: VidHOI
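Because the annotations are dense (every human and object box per frame, linked by interaction triplets), a per-frame record can be sketched roughly as below. This is a minimal illustrative schema, not the official VidHOI file format; all field and class names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical schema (names are illustrative, not the official VidHOI format):
# each frame stores boxes for every visible human and object, plus
# (human, predicate, object) triplets linking them by track id.

@dataclass
class Entity:
    track_id: int                            # identity is stable across frames
    category: str                            # e.g. "adult", "dog", "toy"
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2)

@dataclass
class FrameAnnotation:
    frame_index: int
    entities: List[Entity]
    # interactions as (human_track_id, predicate, object_track_id)
    interactions: List[Tuple[int, str, int]]

frame = FrameAnnotation(
    frame_index=120,
    entities=[
        Entity(0, "adult", (10.0, 20.0, 110.0, 320.0)),
        Entity(3, "dog", (150.0, 200.0, 260.0, 330.0)),
    ],
    interactions=[(0, "watch", 3), (0, "next_to", 3)],
)

print(len(frame.interactions))  # → 2 ground-truth HOI triplets in this frame
```

Under a schema like this, the two benchmark tasks differ only in timing: detection recovers the triplets of the current frame, while anticipation predicts the triplets of a future frame from past frames only.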
This dataset is used in 2 benchmarks:
| Task | Model | Paper | Date |
|---|---|---|---|
| Human-Object Interaction Detection | HOI4ABOT | HOI4ABOT: Human-Object Interaction Anticipation for … | 2023-09-28 |
| Human-Object Interaction Anticipation | HOI4ABOT | HOI4ABOT: Human-Object Interaction Anticipation for … | 2023-09-28 |
| Human-Object Interaction Detection | ST-GAZE | Human-Object Interaction Prediction in Videos … | 2023-06-06 |
| Human-Object Interaction Anticipation | ST-GAZE | Human-Object Interaction Prediction in Videos … | 2023-06-06 |
| Human-Object Interaction Detection | STTRAN | ST-HOI: A Spatial-Temporal Baseline for … | 2021-05-25 |
| Human-Object Interaction Anticipation | STTRAN | ST-HOI: A Spatial-Temporal Baseline for … | 2021-05-25 |