Form Understanding in Noisy Scanned Documents
Form Understanding in Noisy Scanned Documents (FUNSD) comprises 199 real, fully annotated, scanned forms. The documents are noisy and vary widely in appearance, which makes form understanding (FoUn) a challenging task. The dataset can be used for several tasks, including text detection, optical character recognition, spatial layout analysis, and entity labeling and linking.
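To make the entity labeling and linking task concrete, here is a minimal sketch of how question/answer pairs might be read from one annotation. It assumes the JSON layout used in the public FUNSD release: a top-level "form" list whose entities carry "id", "text", "box", "label", and "linking" fields, with each link stored as an [id, id] pair. The sample record and the helper name link_pairs are made up for illustration and are not taken from the dataset.

```python
import json

# Hypothetical annotation, shaped like the assumed FUNSD JSON layout.
# Real files hold one such object per scanned form.
SAMPLE = """
{"form": [
  {"id": 0, "text": "Date:", "box": [10, 40, 60, 55], "label": "question",
   "linking": [[0, 1], [2, 0]],
   "words": [{"text": "Date:", "box": [10, 40, 60, 55]}]},
  {"id": 1, "text": "3/4/92", "box": [70, 40, 130, 55], "label": "answer",
   "linking": [[0, 1]],
   "words": [{"text": "3/4/92", "box": [70, 40, 130, 55]}]},
  {"id": 2, "text": "REPORT", "box": [10, 10, 90, 30], "label": "header",
   "linking": [[2, 0]],
   "words": [{"text": "REPORT", "box": [10, 10, 90, 30]}]}
]}
"""


def link_pairs(annotation):
    """Return (question text, answer text) tuples for linked entities."""
    entities = {entity["id"]: entity for entity in annotation["form"]}

    # A link may be listed on both of its endpoints, so collect into a set.
    links = set()
    for entity in entities.values():
        for src, dst in entity["linking"]:
            links.add((src, dst))

    pairs = []
    for src, dst in sorted(links):
        first, second = entities[src], entities[dst]
        # Keep only question -> answer links; skip header -> question, etc.
        if first["label"] == "question" and second["label"] == "answer":
            pairs.append((first["text"], second["text"]))
    return pairs


if __name__ == "__main__":
    print(link_pairs(json.loads(SAMPLE)))
```

The "box" values are treated here as pixel coordinates of the entity's bounding box and are not used by the helper; they would matter for the text detection and layout analysis tasks.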
Source: FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents
Image source: https://guillaumejaume.github.io/FUNSD/
This dataset is used in 3 benchmarks.