COCO-Text

Dataset Information
Modalities: Images, Texts
Languages: English
Introduced: 2016

Overview

COCO-Text is a dataset for text detection and recognition in natural scene images. It is based on the MS COCO dataset, which contains images of complex everyday scenes. Its images fall into three groups: images with no text, images with legible text, and images with illegible text. In total, 22,184 training images and 7,026 validation images contain at least one instance of legible text.

Source: Improving Text Proposals for Scene Images with Fully Convolutional Networks
Image Source: https://vision.cornell.edu/se3/coco-text-2/

Variants: COCO-Text
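
The split counts quoted in the overview are counts of images that contain at least one legible text instance. A minimal sketch of that kind of count is given below. It assumes each image record carries a `set` field (`train` or `val`) and each annotation carries `image_id` and `legibility` fields; these field names are assumptions about the annotation layout, the function name `count_legible_images` is hypothetical, and the records are toy data, not values from the dataset.

```python
from collections import Counter


def count_legible_images(imgs, anns):
    """Count, per split, the images with at least one legible text instance.

    imgs: dict mapping image id -> {"set": "train" | "val"}   (assumed layout)
    anns: dict mapping annotation id -> {"image_id": ..., "legibility": ...}
    """
    # Collect the ids of images that own at least one legible annotation.
    legible_image_ids = {
        ann["image_id"] for ann in anns.values() if ann["legibility"] == "legible"
    }
    # Tally those images by the split they belong to.
    return Counter(
        img["set"] for img_id, img in imgs.items() if img_id in legible_image_ids
    )


# Toy records for illustration only (not real COCO-Text data).
imgs = {
    1: {"set": "train"},
    2: {"set": "train"},
    3: {"set": "val"},
    4: {"set": "val"},  # image with no text annotations at all
}
anns = {
    10: {"image_id": 1, "legibility": "legible"},
    11: {"image_id": 1, "legibility": "illegible"},
    12: {"image_id": 2, "legibility": "illegible"},
    13: {"image_id": 3, "legibility": "legible"},
}

counts = count_legible_images(imgs, anns)
print(dict(counts))  # {'train': 1, 'val': 1}
```

Image 2 has only illegible text and image 4 has no text, so neither is counted. Against the real annotation file the same logic would apply once the records are loaded, provided the field names match.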

Associated Benchmarks

This dataset is used in 2 benchmarks.

Recent Benchmark Submissions

| Task | Model | Paper | Date |
|---|---|---|---|
| Scene Text Recognition | CLIP4STR-L | CLIP4STR: A Simple Baseline for … | 2023-05-23 |
| Scene Text Recognition | CLIP4STR-B | CLIP4STR: A Simple Baseline for … | 2023-05-23 |
| Scene Text Recognition | MGP-STR | Multi-Granularity Prediction for Scene Text … | 2022-09-08 |
| Scene Text Recognition | PARSeq | Scene Text Recognition with Permuted … | 2022-07-14 |
| Scene Text Detection | Corner-based Region Proposals | Detecting Multi-Oriented Text with Corner-based … | 2018-04-08 |
| Scene Text Detection | TextBoxes++_MS | TextBoxes++: A Single-Shot Oriented Scene … | 2018-01-09 |
| Scene Text Detection | SSTD | Single Shot Text Detector with … | 2017-09-01 |
| Scene Text Detection | WordSup (VGG16-synth-coco) | WordSup: Exploiting Word Annotations for … | 2017-08-22 |
| Scene Text Detection | EAST + VGG16 | EAST: An Efficient and Accurate … | 2017-04-11 |
| Scene Text Detection | Yao et al. | Scene Text Detection via Holistic, … | 2016-06-29 |

Research Papers

Recent papers with results on this dataset: