Dataset for text relevance ranking in Czech
DaReCzech is a dataset for text relevance ranking in Czech. It consists of more than 1.6M annotated query-document pairs, which makes it one of the largest available datasets for this task.
Please first read the disclaimer, which contains the terms of use. If you agree to them, send an email to [email protected] and a link to the dataset will be sent to you.
Variants: DaReCzech
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Document Ranking | Query-doc RobeCzech (Roberta-base) | Siamese BERT-based Model for Web … | 2021-12-03 |
Document Ranking | Query-doc Small-E-Czech (Electra-small) | Siamese BERT-based Model for Web … | 2021-12-03 |
Document Ranking | Siamese Small-E-Czech (Electra-small) | Siamese BERT-based Model for Web … | 2021-12-03 |