MemexQA is a large, realistic multimodal dataset consisting of real personal photos and crowd-sourced questions and answers.
Variants: MemexQA
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Memex Question Answering | FVTA | Focal Visual-Text Attention for Visual … | 2018-06-05 |