BIOSSES

Biomedical Semantic Similarity Estimation System

Dataset Information
Modalities: Texts, Medical
Introduced: 2017
License: Unknown
Homepage

Overview

The BIOSSES data set comprises a total of 100 sentence pairs, all of which were selected from the "TAC2 Biomedical Summarization Track Training Data Set".

Five human experts evaluated each sentence pair, judging its similarity and assigning a score in the range [0, 4]. The annotation guideline was based on the SemEval 2012 Task 6 guideline.
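
As a rough illustration of how five expert ratings per pair might be turned into a single reference score and compared against a model's predictions, here is a minimal Python sketch. It assumes the reference score is the mean of the five ratings and that agreement is measured with Pearson correlation; neither choice is stated on this page. The function names `gold_score` and `pearson` and all numbers are hypothetical and are not taken from the dataset.

```python
from math import sqrt
from statistics import mean


def gold_score(annotator_scores):
    """Average five expert ratings (each in [0, 4]) into one score.

    Averaging is an assumption made for this sketch.
    """
    if len(annotator_scores) != 5:
        raise ValueError("expected five annotator scores")
    if any(not 0 <= s <= 4 for s in annotator_scores):
        raise ValueError("scores must lie in [0, 4]")
    return mean(annotator_scores)


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up ratings for three sentence pairs (not real BIOSSES data).
ratings = [
    [4, 4, 3, 4, 4],
    [0, 1, 0, 0, 1],
    [2, 2, 3, 2, 2],
]
reference = [gold_score(r) for r in ratings]

# Made-up model predictions on the same 0-4 scale.
predicted = [3.5, 0.8, 2.1]
print(pearson(reference, predicted))
```

With only 100 pairs, any correlation computed this way is sensitive to how the data is split, so results are easiest to compare when the split is reported alongside the score.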


Variants: BIOSSES

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                | Model                 | Paper                                        | Date
Semantic Similarity | BioLinkBERT (large)   | LinkBERT: Pretraining Language Models with … | 2022-03-29
Semantic Similarity | BioLinkBERT (base)    | LinkBERT: Pretraining Language Models with … | 2022-03-29
Semantic Similarity | NCBI_BERT(base) (P+M) | Transfer Learning in Biomedical Natural …    | 2019-06-13

Research Papers

Recent papers with results on this dataset: