| ClawCraneNet | ClawCraneNet: Leveraging Object-level Relation fo… | 0.66 | 2021-03-19 |
| RefVOS | RefVOS: A Closer Look at Referring Expressions fo… | 0.60 | 2020-10-01 |
| SgMg (Video-Swin-B) | Spectrum-guided Multi-granularity Referring Video… | 0.59 | 2023-07-25 |
| SOC (Video-Swin-B) | SOC: Semantic-Assisted Object Cluster for Referri… | 0.57 | 2023-05-26 |
| ReferFormer (Video-Swin-B) | Language as Queries for Referring Video Object Se… | 0.55 | 2022-01-03 |
| SOC (Video-Swin-T) | SOC: Semantic-Assisted Object Cluster for Referri… | 0.50 | 2023-05-26 |
| MANET | Multi-Attention Network for Compressed Video Refe… | 0.47 | 2022-07-26 |
| VLIDE | Deeply Interleaved Two-Stream Encoder for Referri… | 0.47 | 2022-03-30 |
| Locater | Local-Global Context Aware Transformer for Langua… | 0.47 | 2022-03-18 |
| MTTR (w=10) | End-to-End Referring Video Object Segmentation wi… | 0.46 | 2021-11-29 |
| MTTR (w=8) | End-to-End Referring Video Object Segmentation wi… | 0.45 | 2021-11-29 |
| CMSA+CFSA | Referring Segmentation in Images and Videos with … | 0.43 | 2021-02-09 |
| mmmmtbvs | Modeling Motion with Multi-Modal Features for Tex… | 0.42 | 2022-04-06 |
| CMPC-V (I3D) | Cross-Modal Progressive Comprehension for Referri… | 0.40 | 2021-05-15 |
| Hui et al. | Collaborative Spatial-Temporal Modeling for Langu… | 0.40 | 2021-05-14 |
| AAMN | Actor and Action Modular Network for Text-based V… | 0.40 | 2020-11-02 |
| CMPC-V (R2D) | Cross-Modal Progressive Comprehension for Referri… | 0.35 | 2021-05-15 |
| Gavrilyuk et al. (Optical flow) | Actor and Action Video Segmentation from a Senten… | 0.22 | 2018-03-20 |
| Gavrilyuk et al. | Actor and Action Video Segmentation from a Senten… | 0.20 | 2018-03-20 |
| Hu et al. | Segmentation from Natural Language Expressions | 0.13 | 2016-03-20 |