The Newsela dataset was introduced by Xu et al. in their research on text simplification. It is a corpus of thousands of news articles, each professionally rewritten at several levels of reading complexity. It is made available to academic partners upon request and is often used as a benchmark in research on text difficulty and text simplification. Note that the Newsela dataset is distinct from the NELA datasets, which are collections of news articles for the study of media bias and other applications.
Variants: Newsela
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Text Simplification | Edit-Unsup-TS | Iterative Edit-Based Unsupervised Sentence Simplification | 2020-06-17 |
Text Simplification | CRF Alignment + Transformer | Neural CRF Model for Sentence … | 2020-05-05 |
Text Simplification | EditNTS | EditNTS: An Neural Programmer-Interpreter Model … | 2019-06-19 |
Text Simplification | S2S-Cluster-FA | Complexity-Weighted Loss and Diverse Reranking … | 2019-04-04 |
Text Simplification | DMASS + DCSS | Integrating Transformer and Paraphrase Rules … | 2018-10-26 |
Text Simplification | Pointer + Multi-task Entailment and Paraphrase Generation | Dynamic Multi-Level Multi-Task Learning for … | 2018-06-19 |
Text Simplification | NSELSTM-S | Sentence Simplification with Memory-Augmented Neural … | 2018-04-20 |
Text Simplification | NSELSTM-B | Sentence Simplification with Memory-Augmented Neural … | 2018-04-20 |
Text Simplification | DRESS | Sentence Simplification with Deep Reinforcement … | 2017-03-31 |
Text Simplification | DRESS-LS | Sentence Simplification with Deep Reinforcement … | 2017-03-31 |
Recent papers with results on this dataset: