| UnitedSynT5 (3B) | First Train to Generate, then Generate to Train: … | 94.70 | 2024-12-12 |
| UnitedSynT5 (335M) | First Train to Generate, then Generate to Train: … | 93.50 | 2024-12-12 |
| Neural Tree Indexers for Text Understanding | Entailment as Few-Shot Learner | 93.10 | 2021-04-29 |
| EFL (Entailment as Few-shot Learner) + RoBERTa-large | Entailment as Few-Shot Learner | 93.10 | 2021-04-29 |
| MT-DNN-SMARTLARGEv0 | SMART: Robust and Efficient Fine-Tuning for Pre-t… | 92.60 | 2019-11-08 |
| RoBERTa-large+Self-Explaining | Self-Explaining Structures Improve NLP Models | 92.30 | 2020-12-03 |
| RoBERTa-large + self-explaining layer | Self-Explaining Structures Improve NLP Models | 92.30 | 2020-12-03 |
| CA-MTL | Conditionally Adaptive Multi-Task Learning: Impro… | 92.10 | 2020-09-19 |
| SemBERT | Semantics-aware BERT for Language Understanding | 91.90 | 2019-09-05 |
| MT-DNN | Multi-Task Deep Neural Networks for Natural Langu… | 91.60 | 2019-01-31 |
| MT-DNN-SMART_100%ofTrainingData | SMART: Robust and Efficient Fine-Tuning for Pre-t… | 91.60 | 2019-11-08 |
| SJRC (BERT-Large +SRL) | Explicit Contextual Semantics for Text Comprehens… | 91.30 | 2018-09-08 |
| Ntumpha | Multi-Task Deep Neural Networks for Natural Langu… | 90.50 | 2019-01-31 |
| Densely-Connected Recurrent and Co-Attentive Network Ensemble | Semantic Sentence Matching with Densely-connected… | 90.10 | 2018-05-29 |
| 300D DMAN Ensemble | Discourse Marker Augmented Network with Reinforce… | 89.60 | 2019-07-23 |
| 450D DR-BiLSTM Ensemble | DR-BiLSTM: Dependent Reading Bidirectional LSTM f… | 89.30 | 2018-02-15 |
| 300D CAFE Ensemble | Compare, Compress and Propagate: Enhancing Neural… | 89.30 | 2017-12-30 |
| ESIM + ELMo Ensemble | Deep contextualized word representations | 89.30 | 2018-02-15 |
| KIM Ensemble | Neural Natural Language Inference Models Enhanced… | 89.10 | 2017-11-12 |
| SLRC | Explicit Contextual Semantics for Text Comprehens… | 89.10 | 2018-09-08 |
| RE2 | Simple and Effective Text Matching with Richer Al… | 88.90 | 2019-08-01 |
| Densely-Connected Recurrent and Co-Attentive Network | Semantic Sentence Matching with Densely-connected… | 88.90 | 2018-05-29 |
| DEIM | DEIM: An effective deep encoding and interaction … | 88.90 | 2022-03-20 |
| 448D Densely Interactive Inference Network (DIIN, code) Ensemble | Natural Language Inference over Interaction Space | 88.90 | 2017-09-13 |
| 300D DMAN | Discourse Marker Augmented Network with Reinforce… | 88.80 | 2019-07-23 |
| BiMPM Ensemble | Bilateral Multi-Perspective Matching for Natural … | 88.80 | 2017-02-13 |
| ESIM + ELMo | Deep contextualized word representations | 88.70 | 2018-02-15 |
| MT-DNN-SMART_10%ofTrainingData | SMART: Robust and Efficient Fine-Tuning for Pre-t… | 88.70 | 2019-11-08 |
| KIM | Neural Natural Language Inference Models Enhanced… | 88.60 | 2017-11-12 |
| 600D ESIM + 300D Syntactic TreeLSTM | Enhanced LSTM for Natural Language Inference | 88.60 | 2016-09-20 |
| 450D DR-BiLSTM | DR-BiLSTM: Dependent Reading Bidirectional LSTM f… | 88.50 | 2018-02-15 |
| Stochastic Answer Network | Stochastic Answer Networks for Natural Language I… | 88.50 | 2018-04-21 |
| 300D CAFE | Compare, Compress and Propagate: Enhancing Neural… | 88.50 | 2017-12-30 |
| Biattentive Classification Network + CoVe + Char | Learned in Translation: Contextualized Word Vecto… | 88.10 | 2017-08-01 |
| aESIM | Attention Boosted Sequential Inference Model | 88.10 | 2018-12-05 |
| 448D Densely Interactive Inference Network (DIIN, code) | Natural Language Inference over Interaction Space | 88.00 | 2017-09-13 |
| Enhanced Sequential Inference Model (Chen et al., [2017a]) | Enhanced LSTM for Natural Language Inference | 88.00 | 2016-09-20 |
| BiMPM | Bilateral Multi-Perspective Matching for Natural … | 87.50 | 2017-02-13 |
| 2400D Multiple-Dynamic Self-Attention Model | Dynamic Self-Attention : Computing Attention over… | 87.40 | 2018-08-22 |
| 300D Full tree matching NTI-SLSTM-LSTM w/ global attention | Neural Tree Indexers for Text Understanding | 87.30 | 2016-07-15 |
| 300D 2-layer Bi-CAS-LSTM | Cell-aware Stacked LSTMs for Modeling Sentences | 87.00 | 2018-09-07 |
| 200D decomposable attention feed-forward model with intra-sentence attention | A Decomposable Attention Model for Natural Langua… | 86.80 | 2016-06-06 |
| 200D decomposable attention model with intra-sentence attention | A Decomposable Attention Model for Natural Langua… | 86.80 | 2016-06-06 |
| 600D Dynamic Self-Attention Model | Dynamic Self-Attention : Computing Attention over… | 86.80 | 2018-08-22 |
| CBS-1 + ESIM | Parameter Re-Initialization through Cyclical Batc… | 86.73 | 2018-12-04 |
| 512D Dynamic Meta-Embeddings | Dynamic Meta-Embeddings for Improved Sentence Rep… | 86.70 | 2018-04-21 |
| 600D BiLSTM with generalized pooling | Enhancing Sentence Embedding with Generalized Poo… | 86.60 | 2018-06-26 |
| 600D Hierarchical BiLSTM with Max Pooling (HBMP, code) | Sentence Embeddings in NLI with Iterative Refinem… | 86.60 | 2018-08-27 |
| Densely-Connected Recurrent and Co-Attentive Network (encoder) | Semantic Sentence Matching with Densely-connected… | 86.50 | 2018-05-29 |
| 300D Reinforced Self-Attention Network | Reinforced Self-Attention Network: a Hybrid of Ha… | 86.30 | 2018-01-31 |
| Distance-based Self-Attention Network | Distance-based Self-Attention Network for Natural… | 86.30 | 2017-12-06 |
| 200D decomposable attention feed-forward model | A Decomposable Attention Model for Natural Langua… | 86.30 | 2016-06-06 |
| 200D decomposable attention model | A Decomposable Attention Model for Natural Langua… | 86.30 | 2016-06-06 |
| 450D LSTMN with deep attention fusion | Long Short-Term Memory-Networks for Machine Readi… | 86.30 | 2016-01-25 |
| 300D mLSTM word-by-word attention model | Learning Natural Language Inference with LSTM | 86.10 | 2015-12-30 |
| 600D Gumbel TreeLSTM encoders | Learning to Compose Task-Specific Tree Structures | 86.00 | 2017-07-10 |
| 600D Residual stacked encoders | Shortcut-Stacked Sentence Encoders for Multi-Doma… | 86.00 | 2017-08-07 |
| Star-Transformer (no cross sentence attention) | Star-Transformer | 86.00 | 2019-02-25 |
| MT-DNN-SMART_1%ofTrainingData | SMART: Robust and Efficient Fine-Tuning for Pre-t… | 86.00 | 2019-11-08 |
| 300D CAFE (no cross-sentence attention) | Compare, Compress and Propagate: Enhancing Neural… | 85.90 | 2017-12-30 |
| 300D Residual stacked encoders | Shortcut-Stacked Sentence Encoders for Multi-Doma… | 85.70 | 2017-08-07 |
| 300D LSTMN with deep attention fusion | Long Short-Term Memory-Networks for Machine Readi… | 85.70 | 2016-01-25 |
| 300D Gumbel TreeLSTM encoders | Learning to Compose Task-Specific Tree Structures | 85.60 | 2017-07-10 |
| 300D Directional self-attention network encoders | DiSAN: Directional Self-Attention Network for RNN… | 85.60 | 2017-09-14 |
| 600D (300+300) Deep Gated Attn. BiLSTM encoders | Recurrent Neural Network-Based Sentence Encoder w… | 85.50 | 2017-08-04 |
| 300D MMA-NSE encoders with attention | Neural Semantic Encoders | 85.40 | 2016-07-14 |
| 50D stacked TC-LSTMs | Modelling Interaction of Sentence Pair with coupl… | 85.10 | 2016-05-18 |
| 600D (300+300) BiLSTM encoders with intra-attention and symbolic preproc. | Learning Natural Language Inference using Bidirec… | 85.00 | 2016-05-30 |
| Stacked Bi-LSTMs (shortcut connections, max-pooling) | Combining Similarity Features and Deep Representa… | 84.80 | 2018-11-02 |
| 300D NSE encoders | Neural Semantic Encoders | 84.60 | 2016-07-14 |
| 4096D BiLSTM with max-pooling | Supervised Learning of Universal Sentence Represe… | 84.50 | 2017-05-05 |
| Bi-LSTM sentence encoder (max-pooling) | Combining Similarity Features and Deep Representa… | 84.50 | 2018-11-02 |
| Stacked Bi-LSTMs (shortcut connections, max-pooling, attention) | Combining Similarity Features and Deep Representa… | 84.40 | 2018-11-02 |
| 600D (300+300) BiLSTM encoders with intra-attention | Learning Natural Language Inference using Bidirec… | 84.20 | 2016-05-30 |
| SWEM-max | Baseline Needs More Love: On Simple Word-Embeddin… | 83.80 | 2018-05-24 |
| 100D LSTMs w/ word-by-word attention | Reasoning about Entailment with Neural Attention | 83.50 | 2015-09-22 |
| 300D NTI-SLSTM-LSTM encoders | Neural Tree Indexers for Text Understanding | 83.40 | 2016-07-15 |
| 600D (300+300) BiLSTM encoders | Learning Natural Language Inference using Bidirec… | 83.30 | 2016-05-30 |
| 300D SPINN-PI encoders | A Fast Unified Model for Parsing and Sentence Und… | 83.20 | 2016-03-19 |
| MT-DNN-SMART_0.1%ofTrainingData | SMART: Robust and Efficient Fine-Tuning for Pre-t… | 82.70 | 2019-11-08 |
| 300D Tree-based CNN encoders | Natural Language Inference by Tree-Based Convolut… | 82.10 | 2015-12-28 |
| 1024D GRU encoders w/ unsupervised 'skip-thoughts' pre-training | Order-Embeddings of Images and Language | 81.40 | 2015-11-19 |
| DELTA (LSTM) | DELTA: A DEep learning based Language Technology … | 80.70 | 2019-08-02 |
| 300D LSTM encoders | A Fast Unified Model for Parsing and Sentence Und… | 80.60 | 2016-03-19 |
| SplitEE-S | SplitEE: Early Exit in Deep Neural Networks with … | 79.00 | 2023-09-17 |
| + Unigram and bigram features | A large annotated corpus for learning natural lan… | 78.20 | 2015-08-21 |
| 100D LSTM encoders | A large annotated corpus for learning natural lan… | 77.60 | 2015-08-21 |
| Unlexicalized features | A large annotated corpus for learning natural lan… | 50.40 | 2015-08-21 |