QC-Science contains 47,832 question-answer pairs from the science domain, each tagged with a label of the form subject - chapter - topic. The dataset was collected with the help of a leading e-learning platform. It is split into 40,895 samples for training, 2,153 for validation, and 4,784 for testing.
Description adapted from: https://arxiv.org/pdf/2107.10649v1.pdf
Image source: https://arxiv.org/pdf/2107.10649v1.pdf
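The split sizes and label format given in the description can be sanity-checked with a short Python sketch. The counts are taken from the description; the ` - ` separator, the helper names `split_fractions` and `parse_label`, and the sample label are assumptions made for illustration and are not taken from the dataset files.

```python
# Split sizes and total as stated in the dataset description.
SPLITS = {"train": 40895, "validation": 2153, "test": 4784}
TOTAL = 47832


def split_fractions(splits):
    """Return each split's share of all samples as a fraction in [0, 1]."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}


def parse_label(label, sep=" - "):
    """Split a 'subject - chapter - topic' label into its three levels.

    Assumption: levels are joined by ' - ' and the separator never
    occurs inside a level name. The real files may be encoded differently.
    """
    parts = [part.strip() for part in label.split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 levels, got {len(parts)}: {label!r}")
    subject, chapter, topic = parts
    return {"subject": subject, "chapter": chapter, "topic": topic}


if __name__ == "__main__":
    # The three splits should add up to the stated total.
    assert sum(SPLITS.values()) == TOTAL

    for name, fraction in split_fractions(SPLITS).items():
        print(f"{name}: {fraction:.1%}")

    # Hypothetical label, used only to show the parsing step.
    print(parse_label("science - physics - kinematics"))
```

The three splits sum to the stated total, and the proportions work out to roughly 85.5% training, 4.5% validation, and 10.0% test.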
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Question-Answer categorization | TagRec(BERT+USE) | TagRec: Automated Tagging of Questions … | 2021-07-03 |
Question-Answer categorization | TagRec(BERT+Sent BERT) | TagRec: Automated Tagging of Questions … | 2021-07-03 |
Question-Answer categorization | BERT+sent2vec | TagRec: Automated Tagging of Questions … | 2021-07-03 |
Question-Answer categorization | BERT+GloVe | TagRec: Automated Tagging of Questions … | 2021-07-03 |
Question-Answer categorization | Twin BERT | TagRec: Automated Tagging of Questions … | 2021-07-03 |
Question-Answer categorization | Pretrained Sent BERT | TagRec: Automated Tagging of Questions … | 2021-07-03 |