Restoration-200k for Response Selection
Statistic | Train | Validation | Test | Ranking Test |
---|---|---|---|---|
size | 0.4M | 50K | 5K | 800 |
pos:neg | 1:1 | 1:9 | 1.2:8.8 | - |
avg turns | 5.0 | 5.0 | 5.0 | 5.0 |
The ranking test set contains high-quality responses selected by several baselines. Their correlation with the conversation context was carefully annotated by 8 professional annotators, and the average annotation scores are saved for ranking. Because these graded correlation scores are provided, the ranking test set should be evaluated with NDCG@3 and NDCG@5. More details are available in the Appendix of the paper.
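As a rough illustration of how NDCG@3 and NDCG@5 could be computed from the averaged annotation scores, here is a minimal sketch. It assumes linear gains and a log2 rank discount; the exact NDCG variant used in the paper is not stated here, and the function names, variable names, and example scores are all hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(model_scores, annotated_scores, k):
    """NDCG@k for one conversation context.

    model_scores: scores a response-selection model assigns to each candidate.
    annotated_scores: averaged human correlation scores for the same candidates
    (assumed to be non-negative).
    """
    # Order candidates by the model's score, best first.
    order = sorted(range(len(model_scores)), key=lambda i: model_scores[i], reverse=True)
    ranked = [annotated_scores[i] for i in order]
    # The ideal ordering sorts candidates by the annotated score itself.
    ideal_dcg = dcg_at_k(sorted(annotated_scores, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical scores for five candidate responses to one context.
model_scores = [0.9, 0.2, 0.7, 0.4, 0.1]
annotated_scores = [3.0, 1.0, 4.5, 2.0, 0.5]

for k in (3, 5):
    print(f"NDCG@{k} = {ndcg_at_k(model_scores, annotated_scores, k):.4f}")
```

In practice the per-context values would be averaged over all contexts in the ranking test set. An exponential gain (2^rel - 1) is a common alternative and gives different numbers, so the choice should be checked against the paper's Appendix.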
Variants: RRS
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Conversational Response Selection | SA-BERT+HCL | Dialogue Response Selection with Hierarchical … | 2020-12-29 |
Conversational Response Selection | SA-BERT+BERT-FP | Speaker-Aware BERT for Multi-Turn Response … | 2020-04-07 |
Conversational Response Selection | BERT | An Effective Domain Adaptive Post-Training … | 2019-08-13 |
Conversational Response Selection | SMN | Sequential Matching Network: A New … | 2016-12-06 |
Recent papers with results on this dataset: