EPHOIE is a fully annotated dataset and the first Chinese benchmark for both text spotting and visual information extraction. It consists of 1,494 images of examination paper heads with complex layouts and backgrounds, containing a total of 15,771 Chinese handwritten or printed text instances.
Source: Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution
Image source: https://github.com/HCIILAB/EPHOIE
This dataset is used in 2 benchmarks:
Task | Model | Paper | Date |
---|---|---|---|
Document AI | LayoutLMv3 | LayoutLMv3: Pre-training for Document AI … | 2022-04-18 |
Key Information Extraction | LayoutLMv3 | LayoutLMv3: Pre-training for Document AI … | 2022-04-18 |