WebVid contains 10 million video clips paired with captions, sourced from the web. The videos are diverse and rich in content.
Both the full 10M set and a 2.5M subset are available for download:
https://github.com/m-bain/webvid-dataset
Variants: WebVid
This dataset is used in 1 benchmark:
| Task | Model | Paper | Date |
|---|---|---|---|
| Text-to-Video Generation | VideoFactory | Swap Attention in Spatiotemporal Diffusions … | 2023-05-18 |
Recent papers with results on this dataset: