Quora Question Pairs

Paraphrase Identification Benchmark

Performance Over Time

📊 Showing 29 results | 📏 Metric: F1

Top Performing Models

Rank	Model	Paper	F1	Date	Code
1	BERT-Base	Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning	9295.00	2020-12-22	📦 rabeehk/compacter 📦 anonymousxzy/compacter
2	data2vec	data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language	92.40	2022-02-07	📦 huggingface/transformers 📦 pytorch/fairseq 📦 AryanShekarlaban/data2vec-pytorch
3	Charformer-Tall	Charformer: Fast Character Transformers via Gradient-based Subword Tokenization	91.40	2021-06-23	📦 google-research/google-research 📦 lucidrains/charformer-pytorch
4	RealFormer	RealFormer: Transformer Likes Residual Attention	91.34	2020-12-21	📦 google-research/google-research 📦 cloneofsimo/RealFormer-pytorch 📦 jaketae/realformer 📦 JunnYu/x-transformers-paddle 📦 aivolcano/BERT_MRC_CLS
5	ALICE	SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization	90.70	2019-11-08	📦 namisan/mt-dnn 📦 microsoft/MT-DNN 📦 archinetai/smart-pytorch
6	StructBERTRoBERTa ensemble	StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding	90.70	2019-08-13	-
7	XLNet-Large (ensemble)	XLNet: Generalized Autoregressive Pretraining for Language Understanding	90.30	2019-06-19	📦 huggingface/transformers 📦 PaddlePaddle/PaddleNLP 📦 zihangdai/xlnet
8	Snorkel MeTaL (ensemble)	Training Complex Models with Multi-Task Weak Supervision	89.90	2018-10-05	📦 HazyResearch/metal
9	MT-DNN	Multi-Task Deep Neural Networks for Natural Language Understanding	89.60	2019-01-31	📦 namisan/mt-dnn 📦 xycforgithub/MultiTask-MRC 📦 ABaldrati/MT-BERT
10	SpanBERT	SpanBERT: Improving Pre-training by Representing and Predicting Spans	89.50	2019-07-24	📦 facebookresearch/SpanBERT 📦 mandarjoshi90/coref 📦 zixinzeng-jennifer/spanbert_trans

All Papers (29)

Paper	Year	Model	Code
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning	2020	BERT-Base	rabeehk/compacter anonymousxzy/compacter
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language	2022	data2vec	huggingface/transformers pytorch/fairseq
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization	2021	Charformer-Tall	google-research/google-research lucidrains/charformer-pytorch
RealFormer: Transformer Likes Residual Attention	2020	RealFormer	google-research/google-research cloneofsimo/RealFormer-pytorch
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization	2019	ALICE	namisan/mt-dnn microsoft/MT-DNN
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding	2019	StructBERTRoBERTa ensemble	-
XLNet: Generalized Autoregressive Pretraining for Language Understanding	2019	XLNet-Large (ensemble)	huggingface/transformers PaddlePaddle/PaddleNLP
Training Complex Models with Multi-Task Weak Supervision	2018	Snorkel MeTaL (ensemble)	HazyResearch/metal
Multi-Task Deep Neural Networks for Natural Language Understanding	2019	MT-DNN	namisan/mt-dnn xycforgithub/MultiTask-MRC
SpanBERT: Improving Pre-training by Representing and Predicting Spans	2019	SpanBERT	facebookresearch/SpanBERT mandarjoshi90/coref
Entailment as Few-Shot Learner	2021	RoBERTa-large 355M + Entailment as Few-shot Learner	PaddlePaddle/PaddleNLP sunyilgdx/prompts4keras cactilab/hateguard
Simple and Effective Text Matching with Richer Alignment Features	2019	RE2	alibaba-edu/simple-effective-text-matching hitvoice/RE2 alibaba-edu/simple-effective-text-matching-pytorch
Natural Language Inference over Interaction Space	2017	DIIN	YichenGong/Densely-Interactive-Inference-Network YerevaNN/DIIN-in-Keras
Multi-task Sentence Encoding Model for Semantic Retrieval in Question Answering Systems	2019	MSEM	-
Cell-aware Stacked LSTMs for Modeling Sentences	2018	Bi-CAS-LSTM	-
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization	2019	SMART-BERT	namisan/mt-dnn microsoft/MT-DNN
Neural Paraphrase Identification of Questions with Noisy Pretraining	2017	pt-DecAtt	-
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding	2020	TRANS-BLSTM	-
Bilateral Multi-Perspective Matching for Natural Language Sentences	2017	BiMPM	google-research-datasets/paws zhiguowang/BiMPM
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning	2018	GenSen	facebookresearch/InferSent facebookresearch/SentEval
FNet: Mixing Tokens with Fourier Transforms	2021	FNet-Large	labmlai/annotated_deep_learning_paper_implementations google-research/google-research
Self-Explaining Structures Improve NLP Models	2020	Random	ShannonAI/Self_Explaining_Structures_Improve_NLP_Models
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization	2019	FreeLB	namisan/mt-dnn microsoft/MT-DNN
Adversarial Self-Attention for Language Understanding	2022	ASA + RoBERTa	gingasan/adversarialsa
Adversarial Self-Attention for Language Understanding	2022	ASA + BERT-base	gingasan/adversarialsa
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding	2018	BERT-LARGE	huggingface/transformers tensorflow/models
TinyBERT: Distilling BERT for Natural Language Understanding	2019	TinyBERT	PaddlePaddle/PaddleNLP huawei-noah/Pretrained-Language-Model
ERNIE: Enhanced Language Representation with Informative Entities	2019	ERNIE	thunlp/ERNIE Mind23-2/MindCode-136
SplitEE: Early Exit in Deep Neural Networks with Split Computing	2023	SplitEE-S	Div290/SplitEE

Model	Paper	F1	Date
BERT-Base	Intrinsic Dimensionality Explains the Effectivene…	9295.00	2020-12-22
data2vec	data2vec: A General Framework for Self-supervised…	92.40	2022-02-07
Charformer-Tall	Charformer: Fast Character Transformers via Gradi…	91.40	2021-06-23
RealFormer	RealFormer: Transformer Likes Residual Attention	91.34	2020-12-21
ALICE	SMART: Robust and Efficient Fine-Tuning for Pre-t…	90.70	2019-11-08
StructBERTRoBERTa ensemble	StructBERT: Incorporating Language Structures int…	90.70	2019-08-13
XLNet-Large (ensemble)	XLNet: Generalized Autoregressive Pretraining for…	90.30	2019-06-19
Snorkel MeTaL (ensemble)	Training Complex Models with Multi-Task Weak Supe…	89.90	2018-10-05
MT-DNN	Multi-Task Deep Neural Networks for Natural Langu…	89.60	2019-01-31
SpanBERT	SpanBERT: Improving Pre-training by Representing …	89.50	2019-07-24
RoBERTa-large 355M + Entailment as Few-shot Learner	Entailment as Few-Shot Learner	89.20	2021-04-29
RE2	Simple and Effective Text Matching with Richer Al…	89.20	2019-08-01
DIIN	Natural Language Inference over Interaction Space	89.06	2017-09-13
MSEM	Multi-task Sentence Encoding Model for Semantic R…	88.86	2019-11-18
Bi-CAS-LSTM	Cell-aware Stacked LSTMs for Modeling Sentences	88.60	2018-09-07
SMART-BERT	SMART: Robust and Efficient Fine-Tuning for Pre-t…	88.50	2019-11-08
pt-DecAtt	Neural Paraphrase Identification of Questions wit…	88.40	2017-04-15
TRANS-BLSTM	TRANS-BLSTM: Transformer with Bidirectional LSTM …	88.28	2020-03-16
BiMPM	Bilateral Multi-Perspective Matching for Natural …	88.17	2017-02-13
GenSen	Learning General Purpose Distributed Sentence Rep…	87.01	2018-03-30
FNet-Large	FNet: Mixing Tokens with Fourier Transforms	85.00	2021-05-09
Random	Self-Explaining Structures Improve NLP Models	80.00	2020-12-03
FreeLB	SMART: Robust and Efficient Fine-Tuning for Pre-t…	74.80	2019-11-08
ASA + RoBERTa	Adversarial Self-Attention for Language Understan…	73.70	2022-06-25
ASA + BERT-base	Adversarial Self-Attention for Language Understan…	72.30	2022-06-25
BERT-LARGE	BERT: Pre-training of Deep Bidirectional Transfor…	72.10	2018-10-11
TinyBERT	TinyBERT: Distilling BERT for Natural Language Un…	71.30	2019-09-23
ERNIE	ERNIE: Enhanced Language Representation with Info…	71.20	2019-05-17
SplitEE-S	SplitEE: Early Exit in Deep Neural Networks with …	71.10	2023-09-17