MRPC

Microsoft Research Paraphrase Corpus

Dataset Information
Modalities: Texts
Languages: English
License: Unknown

Overview

The Microsoft Research Paraphrase Corpus (MRPC) is a corpus of 5,801 sentence pairs collected from newswire articles. Human annotators labelled each pair as either a paraphrase or not a paraphrase. The corpus is divided into a training subset (4,076 sentence pairs, of which 2,753 are paraphrases) and a test subset (1,725 pairs, of which 1,147 are paraphrases).

Source: Exploiting Semantic Annotations and Q-Learning for Constructing an Efficient Hierarchy/Graph Texts Organization
Image Source: https://www.aclweb.org/anthology/I05-5002.pdf
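
The split sizes quoted in the overview imply that roughly two thirds of the pairs are paraphrases, so a classifier that always answers "paraphrase" already scores well above 50% accuracy. The following is a minimal sketch of that arithmetic using only the counts stated above. The `Split` class and its property names are hypothetical helpers introduced for illustration and are not part of any MRPC distribution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Split:
    """Pair counts for one MRPC subset (hypothetical helper, not an official API)."""
    pairs: int
    paraphrases: int

    @property
    def positive_rate(self) -> float:
        # Fraction of pairs labelled as paraphrases; this is also the accuracy
        # of a baseline that always predicts "paraphrase".
        return self.paraphrases / self.pairs

    @property
    def all_positive_f1(self) -> float:
        # F1 of the always-"paraphrase" baseline: precision equals the
        # positive rate and recall is 1, so F1 = 2 * TP / (2 * TP + FP).
        return 2 * self.paraphrases / (self.pairs + self.paraphrases)


# Counts taken from the overview paragraph.
train = Split(pairs=4076, paraphrases=2753)
test = Split(pairs=1725, paraphrases=1147)
total = Split(train.pairs + test.pairs, train.paraphrases + test.paraphrases)

for name, split in (("train", train), ("test", test), ("total", total)):
    print(
        f"{name:5s} pairs={split.pairs:5d} "
        f"positive_rate={split.positive_rate:.3f} "
        f"all_positive_f1={split.all_positive_f1:.3f}"
    )
```

Because of this class imbalance, MRPC results are commonly reported with F1 alongside accuracy, and the baseline figures above give a floor against which reported scores can be read.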

Variants: MRPC, MRPC Dev

Associated Benchmarks

This dataset is used in 3 benchmarks:

Recent Benchmark Submissions

Task | Model | Paper | Date
Semantic Textual Similarity | RoBERTa + SubRegWeigh (K-means) | SubRegWeigh: Effective and Efficient Annotation … | 2024-09-10
Semantic Textual Similarity | RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | LLM.int8(): 8-bit Matrix Multiplication for … | 2022-08-15
Natural Language Inference | DeBERTaV3large | DeBERTaV3: Improving DeBERTa using ELECTRA-Style … | 2021-11-18
Few-Shot Learning | DART | Differentiable Prompt Makes Pre-trained Language … | 2021-08-30
Semantic Textual Similarity | AutoBERT-Zero (Base) | AutoBERT-Zero: Evolving BERT Backbone from … | 2021-07-15
Semantic Textual Similarity | AutoBERT-Zero (Large) | AutoBERT-Zero: Evolving BERT Backbone from … | 2021-07-15
Semantic Textual Similarity | Charformer-Tall | Charformer: Fast Character Transformers via … | 2021-06-23
Semantic Textual Similarity | FNet-Large | FNet: Mixing Tokens with Fourier … | 2021-05-09
Semantic Textual Similarity | RoBERTa-large 355M + Entailment as Few-shot Learner | Entailment as Few-Shot Learner | 2021-04-29
Semantic Textual Similarity | 24hBERT | How to Train BERT with … | 2021-04-15
Semantic Textual Similarity | Nyströmformer | Nyströmformer: A Nyström-Based Algorithm for … | 2021-02-07
Semantic Textual Similarity | MLM+ del-word+ reorder | CLEAR: Contrastive Learning for Sentence … | 2020-12-31
Semantic Textual Similarity | BERT-Base | Intrinsic Dimensionality Explains the Effectiveness … | 2020-12-22
Semantic Textual Similarity | BERT-Large | Intrinsic Dimensionality Explains the Effectiveness … | 2020-12-22
Semantic Textual Similarity | RealFormer | RealFormer: Transformer Likes Residual Attention | 2020-12-21
Semantic Textual Similarity | PSQ (Chen et al., 2020) | A Statistical Framework for Low-bitwidth … | 2020-10-27
Semantic Textual Similarity | BigBird | Big Bird: Transformers for Longer … | 2020-07-28
Semantic Textual Similarity | SqueezeBERT | SqueezeBERT: What can computer vision … | 2020-06-19
Semantic Textual Similarity | MobileBERT | MobileBERT: a Compact Task-Agnostic BERT … | 2020-04-06
Semantic Textual Similarity | FLOATER-large | Learning to Encode Position for … | 2020-03-13

Research Papers

Recent papers with results on this dataset: