CC3M-TagMask

Dataset Information
Modalities: Images, Texts
Languages: English
Introduced: 2024
License: Unknown

Overview

CC3M-TagMask provides tag and mask annotations for image-text pairs from the CC3M validation set. Tag annotations are the words that best describe the relationship between an image and its corresponding text, and so indicate the semantic connection between the visual and textual elements of each pair.
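
As a rough illustration of how tag annotations might be used, the sketch below compares a set of predicted tags against the annotated tags for one pair and reports precision, recall, and F1. The `TaggedPair` record, its field names, the `tag_prf` helper, and the example caption are all hypothetical; they are not the dataset's actual schema or the benchmark's official metric.

```python
from dataclasses import dataclass


@dataclass
class TaggedPair:
    """Hypothetical record: one image-text pair plus its tag annotation."""
    image_id: str         # assumed identifier, not the dataset's real key
    text: str             # text paired with the image
    tags: frozenset       # words annotated as describing the image-text relationship


def tag_prf(predicted: set, annotated: set) -> tuple:
    """Return (precision, recall, F1) of predicted tags against annotated tags."""
    true_pos = len(predicted & annotated)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(annotated) if annotated else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example data, for illustration only.
pair = TaggedPair(
    image_id="example-0",
    text="a dog runs across the beach",
    tags=frozenset({"dog", "beach"}),
)
predicted = {"dog", "runs"}
precision, recall, f1 = tag_prf(predicted, set(pair.tags))
print(precision, recall, f1)
```

Treating the tags as sets keeps the comparison independent of word order, which suits a multi-label setting where each word in the text is either a tag or not.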

Variants: CC3M-TagMask

Associated Benchmarks

This dataset is used in 2 benchmarks, covering the Semantic Segmentation and Multi-Label Text Classification tasks.

Recent Benchmark Submissions

Task                            | Model                 | Paper                                                  | Date
Semantic Segmentation           | TTD (MaskCLIP)        | TTD: Text-Tag Self-Distillation Enhancing Image-Text … | 2024-03-30
Semantic Segmentation           | TTD (TCL)             | TTD: Text-Tag Self-Distillation Enhancing Image-Text … | 2024-03-30
Multi-Label Text Classification | TTD (w/ fine-tuning)  | TTD: Text-Tag Self-Distillation Enhancing Image-Text … | 2024-03-30
Multi-Label Text Classification | TTD (w/o fine-tuning) | TTD: Text-Tag Self-Distillation Enhancing Image-Text … | 2024-03-30
Multi-Label Text Classification | Qwen-72B              | Qwen Technical Report                                  | 2023-09-28
Semantic Segmentation           | TCL                   | Learning to Generate Text-grounded Mask …              | 2022-12-01
Semantic Segmentation           | MaskCLIP              | Extract Free Dense Labels from …                       | 2021-12-02
Multi-Label Text Classification | NLTK                  | NLTK: The Natural Language Toolkit                     | 2002-05-17

Research Papers

Recent papers with results on this dataset: