D4LA

Dataset Information
Languages: English
Introduced: 2023
License: Unknown

Overview

The D4LA dataset is a diverse benchmark for document layout analysis (DLA) derived from the RVL-CDIP dataset. It covers 12 document types with rich layouts, each represented by approximately 1,000 manually annotated images; noisy, handwritten, artistic, and text-scarce images were filtered out. The dataset defines 27 detailed layout categories, including DocTitle, ListText, Header, Table, Equation, and Footer, intended to support real-world applications.
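
As a rough illustration of working with a category scheme like this, the sketch below tallies annotated regions per layout category. It assumes the annotations are available as a COCO-style dictionary with "categories" and "annotations" keys, which this page does not state; the function name count_layout_categories and the sample records are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def count_layout_categories(coco: dict) -> Counter:
    """Count annotated regions per layout category name in a COCO-style dict."""
    # Map numeric category ids to their names, e.g. 1 -> "DocTitle".
    id_to_name = {cat["id"]: cat["name"] for cat in coco.get("categories", [])}
    counts = Counter()
    for ann in coco.get("annotations", []):
        # Fall back to a placeholder for ids missing from the category table.
        counts[id_to_name.get(ann["category_id"], "<unknown>")] += 1
    return counts


# Tiny hypothetical sample (ASSUMPTION: COCO-style layout, made-up boxes).
# A real annotation file would be read with json.load() instead.
sample = {
    "categories": [{"id": 1, "name": "DocTitle"}, {"id": 2, "name": "Table"}],
    "annotations": [
        {"image_id": 10, "category_id": 1, "bbox": [40, 30, 500, 60]},
        {"image_id": 10, "category_id": 2, "bbox": [40, 120, 500, 300]},
        {"image_id": 11, "category_id": 2, "bbox": [35, 90, 480, 260]},
    ],
}

category_counts = count_layout_categories(sample)
```

A per-category tally like this is a quick way to check how evenly the 27 categories are represented before training or evaluating a layout model.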

Variants: D4LA

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Document Layout Analysis | DoPTA | DoPTA: Improving Document Layout Analysis … | 2024-12-17
Document Layout Analysis | DocLayout-YOLO | DocLayout-YOLO: Enhancing Document Layout Analysis … | 2024-10-16
Document Layout Analysis | VGT | Vision Grid Transformer for Document … | 2023-08-29

Research Papers

Recent papers with results on this dataset: