| 24hBERT | How to Train BERT with an Academic Budget | 70.70 | 2021-04-15 |
| XLNet (single model) | XLNet: Generalized Autoregressive Pretraining for… | | 2019-06-19 |
| DeBERTa (large) | DeBERTa: Decoding-enhanced BERT with Disentangled… | | 2020-06-05 |
| ALBERT | ALBERT: A Lite BERT for Self-supervised Learning … | | 2019-09-26 |
| T5-11B | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| MLM+ subs+ del-span | CLEAR: Contrastive Learning for Sentence Represen… | | 2020-12-31 |
| RoBERTa (ensemble) | RoBERTa: A Robustly Optimized BERT Pretraining Ap… | | 2019-07-26 |
| ERNIE 2.0 Large | ERNIE 2.0: A Continual Pre-training Framework for… | | 2019-07-29 |
| ELECTRA | ELECTRA: Pre-training Text Encoders as Discrimina… | | 2020-03-23 |
| T5-Large 770M | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| ERNIE 2.0 Base | ERNIE 2.0: A Continual Pre-training Framework for… | | 2019-07-29 |
| T5-3B | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| T5-Base | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| RE2 | Simple and Effective Text Matching with Richer Al… | | 2019-08-01 |
| DistilBERT 66M | DistilBERT, a distilled version of BERT: smaller,… | | 2019-10-02 |
| BigBird | Big Bird: Transformers for Longer Sequences | | 2020-07-28 |
| T5-Small | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| SWEM-concat | Baseline Needs More Love: On Simple Word-Embeddin… | | 2018-05-24 |
| SqueezeBERT | SqueezeBERT: What can computer vision teach NLP a… | | 2020-06-19 |