CHIP-STS

Semantic Textual Similarity Dataset

Dataset Information
Modalities: Texts
Languages: Chinese
Introduced: 2021
License: Unknown
Homepage

Overview

CHIP-STS (CHIP Semantic Textual Similarity) is a dataset for sentence similarity in the non-i.i.d.
(non-independent and identically distributed) setting, used for the CHIP-STS task. The task targets
transfer learning across disease types on Chinese disease question-and-answer data. Given question
pairs covering 5 different diseases, where the disease types in the training and test sets differ,
the goal is to determine whether the two sentences in each pair are semantically similar.
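
The non-i.i.d. setup described above can be illustrated with a minimal Python sketch. It is not the dataset's official loader or split: the field names (`text1`, `text2`, `disease`, `label`), the placeholder disease names, and the `split_by_disease` helper are all assumptions made for illustration. It only shows one way to hold out disease types so that the training and test sets share none.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionPair:
    # Hypothetical record layout; the real dataset's field names may differ.
    text1: str
    text2: str
    disease: str  # disease type the question pair belongs to
    label: int    # 1 = semantically similar, 0 = not similar


def split_by_disease(pairs, held_out):
    """Send every pair whose disease type is in `held_out` to the test split."""
    train = [p for p in pairs if p.disease not in held_out]
    test = [p for p in pairs if p.disease in held_out]
    return train, test


# Toy placeholder records, not taken from the dataset.
pairs = [
    QuestionPair("question 1", "question 2", "disease_a", 1),
    QuestionPair("question 3", "question 4", "disease_a", 0),
    QuestionPair("question 5", "question 6", "disease_b", 1),
    QuestionPair("question 7", "question 8", "disease_c", 0),
]

train, test = split_by_disease(pairs, held_out={"disease_c"})
train_types = {p.disease for p in train}
test_types = {p.disease for p in test}
```

Because the split is keyed on disease type rather than on random sampling, a model evaluated on `test` has seen no question pairs from the held-out disease, which is what makes the setting a transfer problem.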

Variants: CHIP-STS

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                 Model          Paper                                    Date
Semantic Similarity  MacBERT-large  CBLUE: A Chinese Biomedical Language …   2021-06-15

Research Papers

Recent papers with results on this dataset: