CDR

BioCreative V CDR Task Corpus

Dataset Information
Modalities: Texts
Languages: English
Introduced: 2016
License: Unknown

Overview

The BioCreative V CDR task corpus is manually annotated for chemicals, diseases, and chemical-induced disease (CID) relations. It contains the titles and abstracts of 1500 PubMed articles, split into equally sized train, validation, and test sets (500 articles each). A common protocol is to tune a model on the validation set, then train on the combined train and validation sets before evaluating on the test set. It is also common to filter out negative relations whose disease entity is a hypernym of the disease entity in a corresponding true relation within the same abstract (see Appendix C of this paper for details).
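
The hypernym filter described above can be sketched as follows. This is a minimal illustration, not the procedure from the referenced appendix. It assumes that each disease identifier maps to one or more dotted hierarchy codes (in the style of MeSH tree numbers), that a disease is a hypernym of another when one of its codes is a strict prefix of one of the other's, and that "corresponding" means the two relations share the same chemical. The function names, identifiers, and codes are hypothetical toy values.

```python
from typing import Dict, List, Set, Tuple

Relation = Tuple[str, str]  # (chemical_id, disease_id)


def is_hypernym(candidate: str, target: str,
                tree_numbers: Dict[str, List[str]]) -> bool:
    """Return True if `candidate` is a strict ancestor of `target`.

    Assumption: ancestry is encoded by dotted codes, so "A.B" is an
    ancestor of "A.B.C" but not of "A.BC" or of itself.
    """
    for cand_code in tree_numbers.get(candidate, []):
        for target_code in tree_numbers.get(target, []):
            if target_code.startswith(cand_code + "."):
                return True
    return False


def filter_hypernym_negatives(candidates: List[Relation],
                              positives: Set[Relation],
                              tree_numbers: Dict[str, List[str]]) -> List[Relation]:
    """Drop negative candidates from one abstract whose disease is a
    hypernym of the disease in a true relation with the same chemical."""
    kept = []
    for chem, dis in candidates:
        if (chem, dis) in positives:
            kept.append((chem, dis))  # true relations are always kept
            continue
        true_diseases = [d for c, d in positives if c == chem]
        if any(is_hypernym(dis, d, tree_numbers) for d in true_diseases):
            continue  # negative with a hypernym disease: filtered out
        kept.append((chem, dis))
    return kept


# Toy data for a single abstract (identifiers and codes are made up).
toy_tree_numbers = {
    "KIDNEY_DISEASE": ["C12.777"],
    "ACUTE_KIDNEY_INJURY": ["C12.777.419"],
    "HEADACHE": ["C23.888"],
}
toy_candidates = [
    ("CHEM_1", "KIDNEY_DISEASE"),
    ("CHEM_1", "ACUTE_KIDNEY_INJURY"),
    ("CHEM_2", "KIDNEY_DISEASE"),
    ("CHEM_1", "HEADACHE"),
]
toy_positives = {("CHEM_1", "ACUTE_KIDNEY_INJURY")}

toy_filtered = filter_hypernym_negatives(
    toy_candidates, toy_positives, toy_tree_numbers
)
print(toy_filtered)
```

In this toy abstract only the pair (CHEM_1, KIDNEY_DISEASE) is removed, because its disease is an ancestor of the disease in a true relation for the same chemical. The same disease paired with CHEM_2 is kept, since CHEM_2 has no true relation in the abstract.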

Variants: CDR

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Relation Extraction | DRE-MIR-SciBERT | A Masked Image Reconstruction Network … | 2022-04-21
Relation Extraction | seq2rel (entity hinting) | A sequence-to-sequence approach for document-level … | 2022-04-03
Relation Extraction | Dense-CCNet-SciBERTbase | A Densely Connected Criss-Cross Attention … | 2022-03-26
Relation Extraction | CGM2IR-SciBERTbase | Document-level Relation Extraction with Context … | 2022-01-13
Relation Extraction | SAISORE+CR+ET-SciBERT | SAIS: Supervising and Augmenting Intermediate … | 2021-09-24
Relation Extraction | DocuNet-SciBERTbase | Document-level Relation Extraction as Semantic … | 2021-06-07
Relation Extraction | SSANBiaffine | Entity Structure Within and Throughout: … | 2021-02-20
Relation Extraction | SciBERT-ATLOPBASE | Document-Level Relation Extraction with Adaptive … | 2020-10-21
Relation Extraction | LSR w/o MDP Nodes | Reasoning with Latent Structure Refinement … | 2020-05-13

Research Papers

Recent papers with results on this dataset: