Lipogram-e

Dataset Information
Modalities: Texts
Languages: English
Introduced: 2022
License: MIT

Overview

This dataset consists of three English-language books that do not contain the letter "e": all of "Gadsby" by Ernest Vincent Wright, all of "A Void" by Georges Perec, and almost all of "Eunoia" by Christian Bök (everything except the single chapter that uses the letter "e").

This dataset is contributed as part of the paper "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio", to appear at COLING 2022.

The works in this dataset are examples of lipograms: texts in which a letter or string is systematically omitted. Lipograms are a form of hard-constrained writing.
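The lipogram constraint is easy to check mechanically. The following is a minimal sketch, not part of the dataset or the paper: the function names `violations` and `is_lipogram` are hypothetical, and it assumes that accented forms such as "é" should also count as the banned letter, which is why it applies Unicode NFKD normalization before matching.

```python
import unicodedata


def _fold(text: str) -> str:
    # Decompose accented characters (e.g. "é" -> "e" + combining accent)
    # and fold case so that "E", "e" and "é" all match a banned "e".
    return unicodedata.normalize("NFKD", text).casefold()


def violations(text: str, banned: str = "e") -> list[tuple[int, str]]:
    """Return (line_number, word) pairs for every word containing `banned`."""
    needle = _fold(banned)
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in line.split():
            if needle in _fold(word):
                hits.append((line_no, word))
    return hits


def is_lipogram(text: str, banned: str = "e") -> bool:
    """True if no word in `text` contains the banned letter or string."""
    return not violations(text, banned)


if __name__ == "__main__":
    sample = "Bold as brass, a young man stood in a sunny park.\nThe end"
    for line_no, word in violations(sample):
        print(f"line {line_no}: {word}")
```

Because `banned` is treated as a substring, the same check covers lipograms that omit a string rather than a single letter.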

Variants: Lipogram-e

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                         Model                       Paper                          Date
Conditional Text Generation  GPT-2-with-filter           Most Language Models can be …  2023-06-28
Conditional Text Generation  GPT-2-fine-tuned-20-epochs  Most Language Models can be …  2023-06-28
Conditional Text Generation  GPT-2-fine-tuned-5-epochs   Most Language Models can be …  2023-06-28
Conditional Text Generation  GPT-2-no-fine-tuning        Most Language Models can be …  2023-06-28
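
This page does not describe how the "GPT-2-with-filter" entry works. One common way to enforce a hard constraint at generation time is to mask out every vocabulary token that would violate it before choosing the next token. The sketch below illustrates that idea on a toy vocabulary; it is an assumption about what such a filter could look like, not the submitted system, and the names `mask_banned` and `pick_token` are hypothetical.

```python
import math


def mask_banned(scores: dict[str, float], banned: str = "e") -> dict[str, float]:
    """Give every token that contains `banned` a score of -inf."""
    needle = banned.casefold()
    return {
        token: (-math.inf if needle in token.casefold() else score)
        for token, score in scores.items()
    }


def pick_token(scores: dict[str, float], banned: str = "e") -> str:
    """Greedily pick the highest-scoring token that satisfies the constraint."""
    masked = mask_banned(scores, banned)
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("every candidate token violates the constraint")
    return best


if __name__ == "__main__":
    # Toy next-token scores; a real model would supply these values.
    toy_scores = {" the": 5.0, " a": 3.2, " cat": 2.9}
    print(repr(pick_token(toy_scores)))
```

Masking before selection guarantees that the output satisfies the constraint regardless of how the underlying model was trained, which is a plausible reason to compare it against fine-tuned and unmodified baselines.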
