BERT-Base | Intrinsic Dimensionality Explains the Effectivene… | 9295.00 | 2020-12-22 |
| data2vec | data2vec: A General Framework for Self-supervised… | 92.40 | 2022-02-07 |
| Charformer-Tall | Charformer: Fast Character Transformers via Gradi… | 91.40 | 2021-06-23 |
| RealFormer | RealFormer: Transformer Likes Residual Attention | 91.34 | 2020-12-21 |
| ALICE | SMART: Robust and Efficient Fine-Tuning for Pre-t… | 90.70 | 2019-11-08 |
| StructBERTRoBERTa ensemble | StructBERT: Incorporating Language Structures int… | 90.70 | 2019-08-13 |
| XLNet-Large (ensemble) | XLNet: Generalized Autoregressive Pretraining for… | 90.30 | 2019-06-19 |
| Snorkel MeTaL (ensemble) | Training Complex Models with Multi-Task Weak Supe… | 89.90 | 2018-10-05 |
| MT-DNN | Multi-Task Deep Neural Networks for Natural Langu… | 89.60 | 2019-01-31 |
| SpanBERT | SpanBERT: Improving Pre-training by Representing … | 89.50 | 2019-07-24 |
| RoBERTa-large 355M + Entailment as Few-shot Learner | Entailment as Few-Shot Learner | 89.20 | 2021-04-29 |
| RE2 | Simple and Effective Text Matching with Richer Al… | 89.20 | 2019-08-01 |
| DIIN | Natural Language Inference over Interaction Space | 89.06 | 2017-09-13 |
| MSEM | Multi-task Sentence Encoding Model for Semantic R… | 88.86 | 2019-11-18 |
| Bi-CAS-LSTM | Cell-aware Stacked LSTMs for Modeling Sentences | 88.60 | 2018-09-07 |
| SMART-BERT | SMART: Robust and Efficient Fine-Tuning for Pre-t… | 88.50 | 2019-11-08 |
| pt-DecAtt | Neural Paraphrase Identification of Questions wit… | 88.40 | 2017-04-15 |
| TRANS-BLSTM | TRANS-BLSTM: Transformer with Bidirectional LSTM … | 88.28 | 2020-03-16 |
| BiMPM | Bilateral Multi-Perspective Matching for Natural … | 88.17 | 2017-02-13 |
| GenSen | Learning General Purpose Distributed Sentence Rep… | 87.01 | 2018-03-30 |
| FNet-Large | FNet: Mixing Tokens with Fourier Transforms | 85.00 | 2021-05-09 |
| Random | Self-Explaining Structures Improve NLP Models | 80.00 | 2020-12-03 |
| FreeLB | SMART: Robust and Efficient Fine-Tuning for Pre-t… | 74.80 | 2019-11-08 |
| ASA + RoBERTa | Adversarial Self-Attention for Language Understan… | 73.70 | 2022-06-25 |
| ASA + BERT-base | Adversarial Self-Attention for Language Understan… | 72.30 | 2022-06-25 |
| BERT-LARGE | BERT: Pre-training of Deep Bidirectional Transfor… | 72.10 | 2018-10-11 |
| TinyBERT | TinyBERT: Distilling BERT for Natural Language Un… | 71.30 | 2019-09-23 |
| ERNIE | ERNIE: Enhanced Language Representation with Info… | 71.20 | 2019-05-17 |
| SplitEE-S | SplitEE: Early Exit in Deep Neural Networks with … | 71.10 | 2023-09-17 |
|