RVL-CDIP

Dataset Information
Modalities: Images
Languages: English
Introduced: 2015
License: Unknown

Overview

The RVL-CDIP dataset consists of scanned document images in 16 classes, including letter, form, email, resume, and memo. It is split into 320,000 training, 40,000 validation, and 40,000 test images. The images are low quality, noisy, and low resolution, typically 100 dpi.

Source: Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning
Image Source: https://www.cs.cmu.edu/~aharley/rvl-cdip/

Variants: RVL-CDIP, rvl_cdip
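
The split sizes quoted in the overview can be checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the per-class number assumes the 16 classes are evenly represented, which the overview does not state explicitly, and the names `SPLITS` and `NUM_CLASSES` are illustrative only.

```python
# Split sizes as stated in the overview.
SPLITS = {"train": 320_000, "validation": 40_000, "test": 40_000}
NUM_CLASSES = 16

total = sum(SPLITS.values())

# Share of the whole dataset that falls into each split.
fractions = {name: size / total for name, size in SPLITS.items()}

# Assumption: classes are evenly represented across the whole dataset.
per_class = total // NUM_CLASSES

for name, size in SPLITS.items():
    print(f"{name:<10} {size:>7,} images  {fractions[name]:.0%}")
print(f"total      {total:>7,} images, about {per_class:,} per class if balanced")
```

Under these figures the dataset follows an 80/10/10 train/validation/test split over 400,000 images.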

Associated Benchmarks

This dataset is used in 2 benchmarks.

Recent Benchmark Submissions

Task | Model | Paper | Date
Document Image Classification | DoPTA | DoPTA: Improving Document Layout Analysis … | 2024-12-17
Document Image Classification | TransferDoc | GlobalDoc: A Cross-Modal Vision-Language Framework … | 2023-09-11
Document Image Classification | EAML | EAML: Ensemble Self-Attention-based Mutual Learning … | 2023-05-11
Document Image Classification | StrucTexTv2 (small) | StrucTexTv2: Masked Visual-Textual Prediction for … | 2023-03-01
Document Image Classification | StrucTexTv2 (large) | StrucTexTv2: Masked Visual-Textual Prediction for … | 2023-03-01
Document Image Classification | Multimodal (ResNet50) | Multimodal Side-Tuning for Document Classification | 2023-01-16
Document Image Classification | Multimodal (MobileNetV2) | Multimodal Side-Tuning for Document Classification | 2023-01-16
Document Image Classification | VLCDoC | VLCDoC: Vision-Language Contrastive Pre-Training Model … | 2022-05-24
Document Image Classification | LayoutLMV3Large | LayoutLMv3: Pre-training for Document AI … | 2022-04-18
Document Image Classification | LayoutLMv3BASE | LayoutLMv3: Pre-training for Document AI … | 2022-04-18
Document Image Classification | DiT-L | DiT: Self-supervised Pre-training for Document … | 2022-03-04
Document Image Classification | DiT-B | DiT: Self-supervised Pre-training for Document … | 2022-03-04
Document Image Classification | LiLT[EN-R]BASE | LiLT: A Simple yet Effective … | 2022-02-28
Document Image Classification | Donut | OCR-free Document Understanding Transformer | 2021-11-30
Document Image Classification | DocFormerBASE | DocFormer: End-to-End Transformer for Document … | 2021-06-22
Document Image Classification | DocFormer large | DocFormer: End-to-End Transformer for Document … | 2021-06-22
Document Image Classification | BEiT-B | BEiT: BERT Pre-Training of Image … | 2021-06-15
Document Image Classification | LayoutXLM | LayoutXLM: Multimodal Pre-training for Multilingual … | 2021-04-18
Document Image Classification | TILT-Large | Going Full-TILT Boogie on Document … | 2021-02-18
Document Image Classification | TILT-Base | Going Full-TILT Boogie on Document … | 2021-02-18

Research Papers

Recent papers with results on this dataset: