Pubmed

Dataset Information
Modalities: Graphs
Introduced: 2008
License: Unknown

Overview

The PubMed dataset consists of 19717 scientific publications from the PubMed database pertaining to diabetes, each classified into one of three classes. The citation network consists of 44338 links. Each publication is described by a TF-IDF weighted word vector over a dictionary of 500 unique words.

Source: https://linqs.soe.ucsc.edu/data
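
To make the structure described above concrete, here is a minimal Python sketch of TF-IDF weighted word vectors plus a citation edge list, built on a toy corpus. Everything in it is an illustrative assumption: the three documents, their tokens, the two citation links, the names `docs`, `citations`, `tfidf_vectors`, and `neighbors`, and the specific weighting `tf * log(N / df)`. The source does not state which TF-IDF variant the dataset uses, and the real dictionary has 500 words rather than the 5 shown here.

```python
import math
from collections import Counter, defaultdict

# Toy stand-in for the real data (hypothetical): three "publications" and their tokens.
docs = {
    "p1": ["insulin", "glucose", "insulin", "type"],
    "p2": ["glucose", "diet", "type"],
    "p3": ["insulin", "diet", "diet", "exercise"],
}

# Hypothetical citation links as (citing, cited) pairs.
citations = [("p1", "p2"), ("p3", "p1")]


def tfidf_vectors(docs):
    """Return (vocabulary, {doc_id: weight vector}) using tf * log(N / df).

    This is one common TF-IDF variant, assumed here for illustration only.
    """
    vocab = sorted({word for tokens in docs.values() for word in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in docs.values() for word in set(tokens))
    vectors = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        vectors[doc_id] = [
            (counts[word] / len(tokens)) * math.log(n_docs / df[word])
            for word in vocab
        ]
    return vocab, vectors


vocab, features = tfidf_vectors(docs)

# Treat citation links as undirected edges for a simple neighbor lookup.
neighbors = defaultdict(set)
for citing, cited in citations:
    neighbors[citing].add(cited)
    neighbors[cited].add(citing)
```

In this sketch each node carries one feature vector whose length equals the dictionary size, which mirrors the dataset's layout of one 500-dimensional vector per publication.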

Variants: PubMed-Long Val, PubMed-Long Test, Pubmed: fixed 20 node per class, PubMed Central, PubMed (60%/20%/20% random splits), PubMed (48%/32%/20% fixed splits), keyword_pubmed_dataset, enoriega/keyword_pubmed sentence, Pubmed (weighted evaluation), Pubmed random partition, Pubmed (nonstandard variant), Pubmed (biased evaluation), PubMed 20k RCT, Pubmed Full-supervised, Pubmed, PubMed with Public Split: fixed 20 nodes per class, PubMed (0.1%), PubMed (0.05%), PubMed (0.03%)
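
Several of the variant names above refer to how nodes are divided into training, validation, and test sets. The following Python sketch shows two such schemes under stated assumptions: a per-class sample of 20 training nodes and a 60%/20%/20% random partition. The function names `fixed_per_class_split` and `random_fraction_split` are hypothetical, the labels are synthetic (drawn at random, not the real class labels), and the sampled nodes are not the ones used by any published fixed split.

```python
import random
from collections import defaultdict


def fixed_per_class_split(labels, per_class=20, seed=0):
    """Sample `per_class` training node indices from each class (illustrative only)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for node, label in enumerate(labels):
        by_class[label].append(node)
    train = []
    for label in sorted(by_class):
        train.extend(rng.sample(by_class[label], per_class))
    return sorted(train)


def random_fraction_split(n_nodes, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle node indices and cut them into train/val/test by the given fractions."""
    rng = random.Random(seed)
    order = list(range(n_nodes))
    rng.shuffle(order)
    n_train = int(fractions[0] * n_nodes)
    n_val = int(fractions[1] * n_nodes)
    # The test set takes whatever remains after rounding down the first two parts.
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]


# Synthetic labels: 19717 nodes, three classes assigned at random (NOT the real labels).
label_rng = random.Random(42)
labels = [label_rng.randrange(3) for _ in range(19717)]

train_nodes = fixed_per_class_split(labels)
train, val, test = random_fraction_split(len(labels))
```

Seeding the generators keeps both partitions reproducible, which matters when results on different splits are compared.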

Associated Benchmarks

This dataset is used in 6 benchmarks.

Recent Benchmark Submissions

Task | Model | Paper | Date
Node Classification | FIT-GNN | FIT-GNN: Faster Inference Time for … | 2024-10-19
Text Summarization | eyeglaxs | Scaling Up Summarization: Leveraging Large … | 2024-08-28
Node Classification | CGT | Mitigating Degree Biases in Message … | 2023-12-28
Community Detection | CDNMF | Contrastive Deep Nonnegative Matrix Factorization … | 2023-11-04
Node Classification | Graph-MLP + SAF | The Split Matters: Flat Minima … | 2023-06-15
Node Classification | NCSAGE | Clarify Confused Nodes via Separated … | 2023-06-04
Node Classification | NCGCN | Clarify Confused Nodes via Separated … | 2023-06-04
Link Prediction | NESS | NESS: Node Embeddings from Static … | 2023-03-15
Text Summarization | GoSum (extractive) | GoSum: Extractive Summarization of Long … | 2022-11-18
Text Summarization | Lodoss-full-base (extractive) | Toward Unifying Text Segmentation and … | 2022-10-28
Text Summarization | Lodoss-full-large (extractive) | Toward Unifying Text Segmentation and … | 2022-10-28
Text Summarization | BART-LS | Adapting Pretrained Text-to-Text Models for … | 2022-09-21
Text Summarization | GRETEL | GRETEL: Graph Contrastive Topic Enhanced … | 2022-08-21
Node Classification | TREE-G | TREE-G: Decision Trees Contesting Graph … | 2022-07-06
Node Classification | CT-Layer | DiffWire: Inductive Graph Rewiring via … | 2022-06-15
Node Classification | CT-Layer (PE) | DiffWire: Inductive Graph Rewiring via … | 2022-06-15
Text Summarization | FactorSum | Factorizing Content and Budget Decisions … | 2022-05-25
Text Summarization | HiStruct+ | HiStruct+: Improving Extractive Text Summarization … | 2022-03-17
Text Summarization | Top Down Transformer (AdaPool) (464M) | Long Document Summarization with Top-down … | 2022-03-15
Node Classification | PairE | Graph Representation Learning Beyond Node … | 2022-03-03

Research Papers

Recent papers with results on this dataset: