DWIE

Deutsche Welle corpus for Information Extraction

Dataset Information
Modalities: Texts
Languages: English
License: GPL-3.0 License

Overview

The 'Deutsche Welle corpus for Information Extraction' (DWIE) is a multi-task dataset that combines four main Information Extraction (IE) annotation sub-tasks: (i) Named Entity Recognition (NER), (ii) Coreference Resolution, (iii) Relation Extraction (RE), and (iv) Entity Linking. DWIE is conceived as an entity-centric dataset that describes interactions and properties of conceptual entities on the level of the complete document.

Source: https://arxiv.org/abs/2009.12626
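
To make the entity-centric, document-level idea concrete, here is a minimal Python sketch of how such annotations could be represented. It is not the DWIE file format: the class names, field names, relation labels and knowledge-base identifiers below are all hypothetical, chosen only to show how mentions, coreference clusters, entity types, entity links and relations can hang off a shared entity identifier.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Mention:
    start: int       # character offset where the mention begins
    end: int         # character offset just past the mention
    text: str        # surface form as it appears in the document
    entity_id: int   # entity (coreference cluster) the mention refers to


@dataclass
class Entity:
    entity_id: int
    types: List[str]               # NER-style types, attached to the entity
    kb_link: Optional[str] = None  # entity-linking target, None if unlinked


@dataclass
class Relation:
    subject_id: int  # relations connect entities, not individual mentions
    predicate: str
    object_id: int


@dataclass
class Document:
    text: str
    mentions: List[Mention] = field(default_factory=list)
    entities: List[Entity] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

    def add_mention(self, surface: str, entity_id: int) -> None:
        """Locate the first occurrence of `surface` and record it as a mention."""
        start = self.text.index(surface)
        self.mentions.append(Mention(start, start + len(surface), surface, entity_id))

    def coreference_clusters(self) -> Dict[int, List[str]]:
        """Group mention surface forms by the entity they refer to."""
        clusters: Dict[int, List[str]] = defaultdict(list)
        for mention in self.mentions:
            clusters[mention.entity_id].append(mention.text)
        return dict(clusters)


# Toy document; labels and identifiers are illustrative only.
doc = Document(text="Angela Merkel visited Paris. The chancellor met officials in France.")
doc.entities = [
    Entity(0, ["person"], kb_link="Angela_Merkel"),
    Entity(1, ["location"], kb_link="Paris"),
    Entity(2, ["location"], kb_link="France"),
]
doc.add_mention("Angela Merkel", 0)
doc.add_mention("The chancellor", 0)  # same entity, so same cluster
doc.add_mention("Paris", 1)
doc.add_mention("France", 2)
doc.relations = [Relation(1, "located_in", 2)]

print(doc.coreference_clusters())
```

Because relations and links are stored once per entity, every mention in a cluster inherits them, which is the sense in which the annotation is entity-centric rather than mention-centric.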

Variants: DWIE

Associated Benchmarks

This dataset is used in 4 benchmarks:

Recent Benchmark Submissions

Task | Model | Paper | Date
Document-level Relation Extraction | VaeDiff-DocRE | VaeDiff-DocRE: End-to-end Data Augmentation Framework … | 2024-12-18
Relation Extraction | REXEL | REXEL: An End-to-end Model for … | 2024-04-19
Named Entity Recognition (NER) | REXEL | REXEL: An End-to-end Model for … | 2024-04-19
Coreference Resolution | REXEL | REXEL: An End-to-end Model for … | 2024-04-19
Named Entity Recognition (NER) | KB-both | Injecting Knowledge Base Information into … | 2021-07-05
Relation Extraction | KB-both | Injecting Knowledge Base Information into … | 2021-07-05
Coreference Resolution | KB-both | Injecting Knowledge Base Information into … | 2021-07-05
Named Entity Recognition (NER) | Joint+RelProp | DWIE: an entity-centric dataset for … | 2020-09-26
Coreference Resolution | Joint | DWIE: an entity-centric dataset for … | 2020-09-26
Relation Extraction | Joint+AttProp | DWIE: an entity-centric dataset for … | 2020-09-26

Research Papers

Recent papers with results on this dataset: