The CiteSeer dataset consists of 3312 scientific publications, each classified into one of six classes. The citation network consists of 4732 links. Each publication in the dataset is described by a 0/1-valued word vector in which each entry indicates the absence or presence of the corresponding word from the dictionary. The dictionary consists of 3703 unique words.
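The representation described above can be illustrated with a small sketch. It is not the dataset's official loader: the helper names (`binary_word_vector`, `build_adjacency`), the toy dictionary, and the toy links are hypothetical, and treating citation links as undirected is an assumption made only for illustration. Only the four counts at the top come from the description.

```python
# Counts quoted in the dataset description.
NUM_PUBLICATIONS = 3312
NUM_LINKS = 4732
DICTIONARY_SIZE = 3703
NUM_CLASSES = 6


def binary_word_vector(words, dictionary):
    """Return a 0/1 list with a 1 wherever the dictionary word occurs in `words`."""
    present = set(words)
    return [1 if word in present else 0 for word in dictionary]


def build_adjacency(num_nodes, links):
    """Build adjacency sets from (paper_a, paper_b) pairs.

    Assumption: links are treated as undirected here.
    """
    adjacency = [set() for _ in range(num_nodes)]
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


# Hypothetical four-word dictionary and one toy publication.
toy_dictionary = ["graph", "learning", "network", "query"]
toy_vector = binary_word_vector(["network", "graph", "graph"], toy_dictionary)

# Hypothetical three-paper citation network with two links.
toy_adjacency = build_adjacency(3, [(0, 1), (1, 2)])

# Sizes implied by the quoted counts.
feature_entries = NUM_PUBLICATIONS * DICTIONARY_SIZE  # cells in the full 0/1 feature matrix
links_per_publication = NUM_LINKS / NUM_PUBLICATIONS  # average number of links per publication
```

Word repetition does not matter in this encoding: a word that occurs twice in a publication still yields a single 1 in its position.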
Source: https://linqs.soe.ucsc.edu/data
Variants: Citeseer (weighted evaluation), Citeseer random partition, Citeseer Full-supervised, Citeseer (nonstandard variant), Citeseer (biased evaluation), Citeseer, CiteSeer with Public Split: fixed 5 nodes per class, CiteSeer with Public Split: fixed 20 nodes per class, CiteSeer (1%), CiteSeer (0.5%)
This dataset is used in 5 benchmarks.