| Model | Paper | Score | Date |
| --- | --- | --- | --- |
| SpanBERT | SpanBERT: Improving Pre-training by Representing … | 84.80 | 2019-07-24 |
| Cluster-Former (#C=512) | Cluster-Former: Clustering-based Sparse Transform… | 68.00 | 2020-09-13 |
| Locality-Sensitive Hashing | Reformer: The Efficient Transformer | 66.00 | 2020-01-13 |
| Multi-passage BERT | Multi-passage BERT: A Globally Normalized BERT Mo… | 65.10 | 2019-08-22 |
| Sparse Attention | Generating Long Sequences with Sparse Transformers | 64.70 | 2019-04-23 |
| DecaProp | Densely Connected Attention Propagation for Readi… | 63.60 | 2018-11-10 |
| DECAPROP | Densely Connected Attention Propagation for Readi… | 62.20 | 2018-11-10 |
| AMANDA | A Question-Focused Multi-Factor Attention Network… | 56.60 | 2018-01-25 |
| R^3 | R$^3$: Reinforced Reader-Ranker for Open-Domain Q… | 55.30 | 2017-08-31 |
| Focused Hierarchical RNN | Focused Hierarchical RNNs for Conditional Sequenc… | 53.40 | 2018-06-12 |
| DrQA | Reading Wikipedia to Answer Open-Domain Questions | 41.90 | 2017-03-31 |
| ASR | Text Understanding with the Attention Sum Reader … | 22.80 | 2016-03-04 |