The PubMed dataset consists of 19717 scientific publications from the PubMed database pertaining to diabetes, each classified into one of three classes. The citation network consists of 44338 links. Each publication in the dataset is described by a TF-IDF weighted word vector over a dictionary of 500 unique words.
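To illustrate what a TF-IDF weighted word vector over a fixed dictionary looks like, here is a minimal sketch in plain Python. The exact weighting variant used to build the dataset is not stated here, so the formula below (term frequency times log inverse document frequency) is an assumption, one common formulation among several. The function name `tfidf_vectors`, the toy documents, and the 4-word vocabulary are hypothetical stand-ins for the real 500-word dictionary.

```python
import math
from collections import Counter


def tfidf_vectors(docs, vocabulary):
    """Return one TF-IDF weighted vector per document over a fixed vocabulary.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents that contain each vocabulary word.
    df = {w: sum(1 for toks in tokenized if w in toks) for w in vocabulary}

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vec = []
        for w in vocabulary:
            tf = counts[w] / total
            idf = math.log(n_docs / df[w]) if df[w] else 0.0
            vec.append(tf * idf)
        vectors.append(vec)
    return vectors


# Toy stand-ins for publications and for the dataset's word dictionary.
docs = [
    "insulin resistance in type 2 diabetes",
    "insulin therapy for type 1 diabetes",
    "glucose monitoring in diabetes care",
]
vocabulary = ["insulin", "glucose", "diabetes", "type"]
vectors = tfidf_vectors(docs, vocabulary)
```

Each row of `vectors` plays the role of one publication's feature vector. Under this weighting, a word that appears in every document (here "diabetes") gets weight zero, since it does not help distinguish one document from another.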
Source: https://linqs.soe.ucsc.edu/data
Variants: PubMed-Long Val, PubMed-Long Test, Pubmed: fixed 20 node per class, PubMed Central, PubMed (60%/20%/20% random splits), PubMed (48%/32%/20% fixed splits), keyword_pubmed_dataset, enoriega/keyword_pubmed sentence, Pubmed (weighted evaluation), Pubmed random partition, Pubmed (nonstandard variant), Pubmed (biased evaluation), PubMed 20k RCT, Pubmed Full-supervised, Pubmed, PubMed with Public Split: fixed 20 nodes per class, PubMed (0.1%), PubMed (0.05%), PubMed (0.03%)
This dataset is used in 6 benchmarks:
Recent papers with results on this dataset: