Klexikon

Klexikon: A German Dataset for Joint Summarization and Simplification

Dataset Information
Modalities: Texts
Languages: German
Introduced: 2022
License: CC-BY-SA

Overview

The dataset introduces document alignments between German Wikipedia and the children's lexicon Klexikon.
The Wikipedia source texts are written in more complex language than their Klexikon counterparts and are also significantly longer, which makes the dataset suitable for both summarization and simplification.
Previous research has focused on only one of the two; they have not yet been studied comprehensively as a joint task.

Variants: Klexikon

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

| Task | Model | Paper | Date |
| Text Summarization | Luhn's algorithm (25 sentences) | Klexikon: A German Dataset for … | 2022-01-18 |
| Text Summarization | Lead-k | Klexikon: A German Dataset for … | 2022-01-18 |
| Text Summarization | Lead-3 | Klexikon: A German Dataset for … | 2022-01-18 |
| Text Summarization | Full article | Klexikon: A German Dataset for … | 2022-01-18 |
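
The Lead-3 and Lead-k entries above are extractive baselines that take the first sentences of the source article as the summary. A minimal sketch of that idea follows. It assumes a naive regex sentence splitter and a made-up German example text; the function names `split_sentences` and `lead_k` are hypothetical, and the segmentation and choice of k used in the listed submissions may differ.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def lead_k(text, k=3):
    """Return the first k sentences of the text, joined by single spaces."""
    return " ".join(split_sentences(text)[:k])


# Toy input for illustration only; it is not taken from the dataset.
article = (
    "Berlin ist die Hauptstadt von Deutschland. "
    "Die Stadt hat rund 3,7 Millionen Einwohner. "
    "Sie liegt an der Spree. "
    "Berlin ist auch ein Bundesland."
)

print(lead_k(article, k=3))
```

With `k=3` this corresponds to the Lead-3 setting; a regex splitter like this one will mis-split on abbreviations and ordinal dates, so a proper German sentence segmenter would be preferable for real evaluation.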

Research Papers

Recent papers with results on this dataset: