DocRED-IE

Dataset Information
Modalities: Texts
Languages: English
Introduced: 2024
License: MIT

Overview

The DocRED Information Extraction (DocRED-IE) dataset extends the DocRED dataset to support the Document-level Closed Information Extraction (DocIE) task. DocRED-IE is a multi-task dataset that supports five subtasks: (i) Document-level Relation Extraction, (ii) Mention Detection, (iii) Entity Typing, (iv) Entity Disambiguation, and (v) Coreference Resolution, as well as combinations of these such as Named Entity Recognition (NER) and Entity Linking. It also supports two end-to-end tasks: (i) DocIE and (ii) Joint Entity and Relation Extraction. DocRED-IE contains both sentence-level and document-level facts, so it covers short-range as well as long-range interactions across an entire document.
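To make the relationship between the subtasks concrete, the following is a minimal sketch of how one annotated document could be represented so that each subtask reads a different part of the same record. It is not the dataset's actual file format: the class names, field names, relation labels, type labels, and knowledge-base identifiers are all hypothetical, and the toy sentences are invented for illustration. The sketch also assumes the common reading that NER combines Mention Detection with Entity Typing, and that Entity Linking combines Mention Detection with Entity Disambiguation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Mention:
    """One text span (Mention Detection). Offsets are token indices."""
    sent_id: int
    start: int  # inclusive
    end: int    # exclusive
    text: str


@dataclass
class Entity:
    """A cluster of coreferent mentions (Coreference Resolution)."""
    mentions: List[Mention]
    entity_type: str              # Entity Typing label (placeholder values)
    kb_id: Optional[str] = None   # Entity Disambiguation target, None if unlinked


@dataclass(frozen=True)
class Relation:
    """A fact between two entities (Document-level Relation Extraction)."""
    head: int      # index into Document.entities
    tail: int      # index into Document.entities
    relation: str  # hypothetical relation label


@dataclass
class Document:
    title: str
    entities: List[Entity] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)


def ner_view(doc: Document) -> List[Tuple[str, str]]:
    """Assumed NER view: every mention paired with its entity type."""
    return [(m.text, e.entity_type) for e in doc.entities for m in e.mentions]


def entity_linking_view(doc: Document) -> List[Tuple[str, str]]:
    """Assumed Entity Linking view: every linked mention with its KB id."""
    return [(m.text, e.kb_id) for e in doc.entities
            for m in e.mentions if e.kb_id is not None]


def is_document_level(doc: Document, rel: Relation) -> bool:
    """True when no single sentence mentions both the head and the tail."""
    head_sents = {m.sent_id for m in doc.entities[rel.head].mentions}
    tail_sents = {m.sent_id for m in doc.entities[rel.tail].mentions}
    return head_sents.isdisjoint(tail_sents)


# Invented toy document:
#   sentence 0: "Ada Lovelace was born in London ."
#   sentence 1: "The city is the capital of England ."
example = Document(
    title="toy example",
    entities=[
        Entity([Mention(0, 0, 2, "Ada Lovelace")], "PER", "KB:ada"),
        Entity([Mention(0, 5, 6, "London"), Mention(1, 0, 2, "The city")],
               "LOC", "KB:london"),
        Entity([Mention(1, 6, 7, "England")], "LOC"),  # left unlinked
    ],
    relations=[
        Relation(0, 1, "place_of_birth"),   # both in sentence 0
        Relation(1, 2, "capital_of"),       # both in sentence 1
        Relation(0, 2, "country_of_birth"), # spans sentences 0 and 1
    ],
)
```

In this toy record, the third relation is the only document-level fact, because its head and tail never occur in the same sentence. The first two are sentence-level facts.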

Variants: DocRED-IE

Associated Benchmarks

This dataset is used in 4 benchmarks.

Recent Benchmark Submissions

Task | Model | Paper | Date
Coreference Resolution | REXEL | REXEL: An End-to-end Model for … | 2024-04-19
Entity Typing | REXEL | REXEL: An End-to-end Model for … | 2024-04-19
Entity Disambiguation | REXEL | REXEL: An End-to-end Model for … | 2024-04-19
Document-level Relation Extraction | REXEL | REXEL: An End-to-end Model for … | 2024-04-19
