LitBank is an annotated dataset of 100 works of English-language fiction that supports tasks in natural language processing and the computational humanities.
LitBank currently contains annotations for entities, events, entity coreference, and quotation attribution in a sample of ~2,000 words from each of those texts, totaling 210,532 tokens.
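As a quick sanity check on the figures above, the total token count can be divided by the number of texts to see how it lines up with the "~2,000 words" sample size. The sketch below uses only the two numbers stated in the description; the function and constant names are illustrative and are not part of any LitBank tooling.

```python
# Figures taken from the dataset description above.
NUM_TEXTS = 100
TOTAL_TOKENS = 210_532


def mean_tokens_per_text(total_tokens: int, num_texts: int) -> float:
    """Return the average number of annotated tokens per text."""
    if num_texts <= 0:
        raise ValueError("num_texts must be positive")
    return total_tokens / num_texts


avg = mean_tokens_per_text(TOTAL_TOKENS, NUM_TEXTS)
print(f"{avg:.2f} tokens per text on average")  # prints "2105.32 tokens per text on average"
```

The average comes out slightly above 2,000, which is consistent with the approximate sample size if tokens include punctuation as well as words; that explanation is an assumption, since the description does not say how tokens are counted.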
LitBank is licensed under a Creative Commons Attribution 4.0 International License.
Variants: LitBank
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Coreference Resolution | Maverick_incr | Maverick: Efficient and Accurate Coreference … | 2024-07-31 |
Coreference Resolution | longdoc S (OntoNotes + PreCo + LitBank) | On Generalization in Coreference Resolution | 2021-09-20 |
Recent papers with results on this dataset: