KP20k is a large-scale dataset of scholarly articles, with 528K articles for training, 20K for validation, and 20K for testing.
Source: Keyphrase Prediction With Pre-trained Language Model
Image Source: https://arxiv.org/pdf/1704.06879.pdf
Variants: KP20k
This dataset is used in 3 benchmarks:
Recent papers with results on this dataset: