COCO-CN

Dataset Information
Modalities: Images
Languages: Chinese
Introduced: 2018
License: Unknown

Overview

COCO-CN is a bilingual image description dataset enriching MS-COCO with manually written Chinese sentences and tags. The new dataset can be used for multiple tasks including image tagging, captioning and retrieval, all in a cross-lingual setting.

Source: COCO-CN

Variants: COCO-CN

Associated Benchmarks

This dataset is used in 1 benchmark: Image Retrieval.

Recent Benchmark Submissions

Task | Model | Paper | Date
Image Retrieval | CN-CLIP (RN50) | Chinese CLIP: Contrastive Vision-Language Pretraining … | 2022-11-02
Image Retrieval | CN-CLIP (ViT-L/14@336px) | Chinese CLIP: Contrastive Vision-Language Pretraining … | 2022-11-02
Image Retrieval | CN-CLIP (ViT-L/14) | Chinese CLIP: Contrastive Vision-Language Pretraining … | 2022-11-02
Image Retrieval | CN-CLIP (ViT-B/16) | Chinese CLIP: Contrastive Vision-Language Pretraining … | 2022-11-02
Image Retrieval | CN-CLIP (ViT-H/14) | Chinese CLIP: Contrastive Vision-Language Pretraining … | 2022-11-02
Image Retrieval | R2D2 (ViT-B) | CCMB: A Large-scale Chinese Cross-modal … | 2022-05-08
Image Retrieval | R2D2 (ViT-L/14) | CCMB: A Large-scale Chinese Cross-modal … | 2022-05-08
Image Retrieval | Wukong (ViT-L/14) | Wukong: A 100 Million Large-scale … | 2022-02-14
Image Retrieval | Wukong (ViT-B/32) | Wukong: A 100 Million Large-scale … | 2022-02-14

Research Papers

Recent papers with results on this dataset: