GDA

Gene-Disease Associations Corpus

Dataset Information
Introduced
2019

Overview

The Gene-Disease Associations (GDA) corpus contains 30,192 titles and abstracts from PubMed articles that have been automatically labelled for genes, diseases, and gene-disease associations via distant supervision. The test set consists of 1,000 of these examples. A common practice is to hold out a random 20% of the training examples as a validation set.
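The hold-out split described above can be sketched as follows. This is a minimal illustration, not the procedure used by any particular paper: the function name `split_train_validation`, the seed, and the integer placeholder IDs are all assumptions made for the example. The training-set size of 29,192 is simply the 30,192 total minus the 1,000 test examples.

```python
import random


def split_train_validation(examples, val_fraction=0.2, seed=0):
    """Randomly hold out a fraction of the training examples for validation.

    Hypothetical helper for illustration; not part of any GDA release.
    """
    indices = list(range(len(examples)))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_val = int(len(examples) * val_fraction)
    val = [examples[i] for i in indices[:n_val]]
    train = [examples[i] for i in indices[n_val:]]
    return train, val


# Placeholder IDs standing in for the 30,192 - 1,000 = 29,192 training documents.
train_ids = list(range(30_192 - 1_000))
train, val = split_train_validation(train_ids)
print(len(train), len(val))  # 23354 5838
```

Fixing the seed matters when comparing against published results, since different random validation splits can shift the reported scores slightly.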

Variants: GDA

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Relation Extraction | DRE-MIR-SciBERT | A Masked Image Reconstruction Network … | 2022-04-21
Relation Extraction | seq2rel (entity hinting) | A sequence-to-sequence approach for document-level … | 2022-04-03
Relation Extraction | Dense-CCNet-SciBERTbase | A Densely Connected Criss-Cross Attention … | 2022-03-26
Relation Extraction | CGM2IR-SciBERTbase | Document-level Relation Extraction with Context … | 2022-01-13
Relation Extraction | SAISORE+CR+ET-SciBERT | SAIS: Supervising and Augmenting Intermediate … | 2021-09-24
Relation Extraction | DocuNet-SciBERTbase | Document-level Relation Extraction as Semantic … | 2021-06-07
Relation Extraction | SSANBiaffine | Entity Structure Within and Throughout: … | 2021-02-20
Relation Extraction | SciBERT-ATLOPBASE | Document-Level Relation Extraction with Adaptive … | 2020-10-21
Relation Extraction | LSR w/o MDP Nodes | Reasoning with Latent Structure Refinement … | 2020-05-13

Research Papers

Recent papers with results on this dataset: