The AQUAINT Corpus consists of English newswire text drawn from three sources: the Xinhua News Service (People's Republic of China), the New York Times News Service, and the Associated Press Worldstream News Service. It was prepared by the LDC for the AQUAINT Project, and will be used in official benchmark evaluations conducted by the National Institute of Standards and Technology (NIST).
Source: Linguistic Data Consortium
Image Source: https://catalog.ldc.upenn.edu/LDC2002T31
Variants: AQUAINT
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Entity Disambiguation | KBED | Improving Entity Disambiguation by Reasoning … | 2022-07-08 |
Entity Disambiguation | ReFinED | ReFinED: An Efficient Zero-shot-capable Approach … | 2022-07-08 |
Entity Disambiguation | GENRE | Autoregressive Entity Retrieval | 2020-10-02 |
Entity Disambiguation | confidence-order | Global Entity Disambiguation with BERT | 2019-09-01 |
Entity Disambiguation | Global | Deep Joint Entity Disambiguation with … | 2017-04-17 |
Recent papers with results on this dataset: