Wikidata5M

Dataset Information
Introduced: 2019
License: Unknown

Overview

Wikidata5m is a million-scale knowledge graph dataset with an aligned text corpus. It integrates the Wikidata knowledge graph with Wikipedia pages: each entity in Wikidata5m is described by a corresponding Wikipedia page, which makes it possible to evaluate link prediction over entities that were not seen during training.

The dataset is distributed in three parts: a knowledge graph, a corpus, and aliases. Both the transductive and the inductive data splits used in the original paper are provided.
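The difference between the two split types can be illustrated with a small sketch. It assumes that triples are stored one per line as tab-separated head, relation, and tail identifiers; that layout, the helper names (`read_triples`, `unseen_entities`), and the toy triples are assumptions made for illustration and are not taken from the dataset documentation. The idea is simply that a transductive test set only mentions entities that also occur in training, while an inductive test set contains entities that do not.

```python
import io


def read_triples(lines):
    """Parse (head, relation, tail) triples from tab-separated lines.

    The tab-separated layout is an assumption of this sketch.
    """
    triples = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        head, relation, tail = line.split("\t")
        triples.append((head, relation, tail))
    return triples


def entities(triples):
    """Collect every entity that appears as a head or a tail."""
    return {entity for head, _, tail in triples for entity in (head, tail)}


def unseen_entities(train, test):
    """Return the test entities that never occur in the training triples."""
    return entities(test) - entities(train)


# Toy triples, made up for illustration only.
train = read_triples(io.StringIO("Q1\tP31\tQ2\nQ2\tP279\tQ3\n"))
transductive_test = read_triples(io.StringIO("Q1\tP279\tQ3\n"))
inductive_test = read_triples(io.StringIO("Q4\tP31\tQ2\n"))

# Transductive: every test entity was already seen in training.
print(unseen_entities(train, transductive_test))  # set()

# Inductive: the test triple involves an entity absent from training.
print(unseen_entities(train, inductive_test))  # {'Q4'}
```

For a real file, the same `read_triples` helper could be given an open file handle instead of the in-memory strings used here. In the inductive case a model cannot rely on a learned embedding for the unseen entity, which is where the aligned Wikipedia descriptions come in.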

Variants: Wikidata5M

Associated Benchmarks

This dataset is used in 1 benchmark: Link Prediction.

Recent Benchmark Submissions

Task | Model | Paper | Date
Link Prediction | KGT5 + Description | Friendly Neighbors: Contextualized Sequence-to-Sequence Link … | 2023-05-22
Link Prediction | KGT5-context | Friendly Neighbors: Contextualized Sequence-to-Sequence Link … | 2023-05-22
Link Prediction | KGT5-context + Description | Friendly Neighbors: Contextualized Sequence-to-Sequence Link … | 2023-05-22
Link Prediction | KGT5 ComplEx Ensemble | Sequence-to-Sequence Knowledge Graph Completion and … | 2022-03-19
Link Prediction | KGT5 | Sequence-to-Sequence Knowledge Graph Completion and … | 2022-03-19
Link Prediction | SimKGC + Description | SimKGC: Simple Contrastive Knowledge Graph … | 2022-03-04
Link Prediction | KEPLER-Wiki-rel | KEPLER: A Unified Model for … | 2019-11-13
Link Prediction | SimplE | KEPLER: A Unified Model for … | 2019-11-13
Link Prediction | RotatE | KEPLER: A Unified Model for … | 2019-11-13
Link Prediction | ComplEx | KEPLER: A Unified Model for … | 2019-11-13
Link Prediction | TransE | KEPLER: A Unified Model for … | 2019-11-13
Link Prediction | DistMult | KEPLER: A Unified Model for … | 2019-11-13

Research Papers

Recent papers with results on this dataset: