Klexikon

Name: Klexikon
Published: 2022-01-18
License: CC-BY-SA

Klexikon: A German Dataset for Joint Summarization and Simplification

Dataset Information

Modalities

Texts

Languages

German

Introduced

2022

License

CC-BY-SA

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

The dataset introduces document alignments between German Wikipedia and the children's lexicon Klexikon.
The Wikipedia source texts are both written in more complex language than their Klexikon counterparts and significantly longer, which makes the dataset suitable for both summarization and simplification.
Previous research has focused on only one of the two at a time; summarization and simplification have not been studied comprehensively as a joint task.

Variants: Klexikon

Associated Benchmarks

This dataset is used in 1 benchmark:

Text Summarization - Metrics: ROUGE-1, ROUGE-2, ROUGE-L
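
To make the metric names concrete, the following is a minimal sketch of a ROUGE-1 F1 score, computed as unigram overlap between a candidate summary and a reference. It assumes lowercased whitespace tokenization with no stemming or punctuation handling, so it is not the scoring script used for the benchmark and its numbers will not match published results. The function name `rouge_1_f1` is hypothetical.

```python
from collections import Counter


def rouge_1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: clipped unigram overlap (assumed tokenization)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(rouge_1_f1("der Hund bellt", "der Hund schläft"))
```

ROUGE-2 follows the same pattern over bigrams, while ROUGE-L is based on the longest common subsequence instead of n-gram counts.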

Recent Benchmark Submissions

Task	Model	Paper	Date
Text Summarization	Luhn's algorithm (25 sentences)	Klexikon: A German Dataset for …	2022-01-18
Text Summarization	Lead-k	Klexikon: A German Dataset for …	2022-01-18
Text Summarization	Lead-3	Klexikon: A German Dataset for …	2022-01-18
Text Summarization	Full article	Klexikon: A German Dataset for …	2022-01-18
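
The Lead-3 and Lead-k entries above are extractive baselines that use the opening sentences of the source article as the summary. A minimal sketch follows, assuming a naive regex sentence splitter; the helper names `split_sentences` and `lead_k` are hypothetical, and how the paper chooses k or segments sentences is not stated here.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def lead_k(text: str, k: int = 3) -> str:
    """Return the first k sentences of the text as an extractive summary."""
    return " ".join(split_sentences(text)[:k])


if __name__ == "__main__":
    article = "Erster Satz. Zweiter Satz! Dritter Satz? Vierter Satz."
    print(lead_k(article, k=3))
```

A regex splitter like this will mis-segment German abbreviations and ordinals such as "z. B." or "18. Januar", so a proper sentence tokenizer would be needed for any real comparison.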

Research Papers

Recent papers with results on this dataset:

Klexikon: A German Dataset for Joint Summarization and Simplification (2022)

