The How2 dataset contains 13,500 videos, totaling 300 hours of speech, and is split into 185,187 training, 2,022 development (dev), and 2,361 test utterances. It includes English subtitles and crowdsourced Portuguese translations.
Source: Exploring Multiview Correlations in Open-Domain Videos
Variants: How2, How2 300h
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Text Summarization | BertSum | Abstractive Summarization of Spoken and … | 2020-08-21 |
Text Summarization | Ground-truth transcript + Action with Hierarchical Attn | Multimodal Abstractive Summarization for How2 … | 2019-06-19 |
Recent papers with results on this dataset: