BC2GM

Dataset Information
Modalities: Texts
License: Unknown

Overview

Created by Smith et al. in 2008, the BioCreative II Gene Mention Recognition (BC2GM) dataset is built around a task in which participants identify each gene mention in a sentence by giving its start and end character offsets. The training set consists of sentences, each paired with a set of gene mentions (GENE annotations). The text is in English, registration is required for access, and the file format is not specified.
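As a rough illustration of what "start and end characters" can mean in practice, the sketch below recovers a mention from a sentence given its offsets. It rests on assumptions that are not stated on this page and should be checked against the README distributed with the corpus: that annotations are lines of the form `sentence_id|start end|mention text`, that offsets count only non-whitespace characters, and that the end offset is inclusive. The names `GeneMention`, `parse_annotation`, and `extract_span`, as well as the example sentence and identifier, are hypothetical.

```python
from typing import NamedTuple


class GeneMention(NamedTuple):
    sentence_id: str
    start: int  # offset of the first character, whitespace not counted (assumed)
    end: int    # offset of the last character, inclusive, whitespace not counted (assumed)
    text: str


def parse_annotation(line: str) -> GeneMention:
    """Parse one line in the assumed 'sentence_id|start end|mention text' layout."""
    sentence_id, offsets, text = line.rstrip("\n").split("|", 2)
    start, end = (int(x) for x in offsets.split())
    return GeneMention(sentence_id, start, end, text)


def extract_span(sentence: str, start: int, end: int) -> str:
    """Return the substring covering whitespace-free offsets start..end (inclusive)."""
    # positions[k] is the real string index of the k-th non-whitespace character.
    positions = [i for i, ch in enumerate(sentence) if not ch.isspace()]
    return sentence[positions[start]: positions[end] + 1]


# Made-up example data, not taken from the corpus.
sentence = "Expression of the p53 gene was reduced."
mention = parse_annotation("S1|15 17|p53")
recovered = extract_span(sentence, mention.start, mention.end)
```

Comparing `recovered` with `mention.text` is a cheap sanity check that the offset convention assumed here matches the files actually downloaded.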

Variants: BC2GM

Associated Benchmarks

This dataset is used in 1 benchmark.

Recent Benchmark Submissions

Task | Model | Paper | Date
Named Entity Recognition (NER) | UniNER-7B | UniversalNER: Targeted Distillation from Large … | 2023-08-07
Named Entity Recognition (NER) | BioDistilBERT | On the Effectiveness of Compact … | 2022-09-07
Named Entity Recognition (NER) | CompactBioBERT | On the Effectiveness of Compact … | 2022-09-07
Named Entity Recognition (NER) | DistilBioBERT | On the Effectiveness of Compact … | 2022-09-07
Named Entity Recognition (NER) | BioMobileBERT | On the Effectiveness of Compact … | 2022-09-07
Named Entity Recognition (NER) | HGN | Hero-Gang Neural Model For Named … | 2022-05-15
Named Entity Recognition (NER) | BioLinkBERT (large) | LinkBERT: Pretraining Language Models with … | 2022-03-29
Named Entity Recognition (NER) | BERN2 | BERN2: an advanced neural biomedical … | 2022-01-06
Named Entity Recognition (NER) | KeBioLM | Improving Biomedical Pretrained Language Models … | 2021-04-21
Named Entity Recognition (NER) | Spark NLP | Biomedical Named Entity Recognition at … | 2020-11-12
Named Entity Recognition (NER) | PubMedBERT uncased | Domain-Specific Language Model Pretraining for … | 2020-07-31
