Text8

Language Modelling Benchmark

Metric: Bits per Character (BPC), lower is better. Showing 22 results.
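For readers unfamiliar with the metric, the sketch below shows one common way to obtain BPC: divide a model's summed negative log-likelihood, measured in nats, by the number of characters times ln 2. The function name `bits_per_character` and its arguments are hypothetical and chosen for illustration. The papers listed here may compute and report the figure with their own evaluation code.

```python
import math


def bits_per_character(total_nll_nats: float, num_chars: int) -> float:
    """Convert a summed negative log-likelihood (in nats) to bits per character.

    Hypothetical helper for illustration; not taken from any listed paper.
    """
    if num_chars <= 0:
        raise ValueError("num_chars must be positive")
    # Dividing by ln(2) converts nats to bits; dividing by num_chars averages.
    return total_nll_nats / (num_chars * math.log(2))


# Reference point: a uniform model over text8's 27 symbols (a-z plus space)
# pays ln(27) nats per character, which is log2(27) bits per character.
uniform_bpc = bits_per_character(math.log(27) * 1000, 1000)
print(f"uniform baseline: {uniform_bpc:.3f} BPC")
```

Any entry in the table can be read against that uniform baseline: the further below it a model's BPC sits, the better it compresses the text.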

Top Performing Models

| Rank | Model | Paper | BPC | Date | Code |
|------|-------|-------|-----|------|------|
| 1 | td-LSTM (Zhang et al., 2016) | Architectural Complexity Measures of Recurrent Neural Networks | 1.63 | 2016-02-26 | - |
| 2 | td-LSTM-large | Architectural Complexity Measures of Recurrent Neural Networks | 1.49 | 2016-02-26 | - |
| 3 | BFN | Bayesian Flow Networks | 1.41 | 2023-08-14 | nnaisense/bayesian-flow-networks |
| 4 | Unregularised mLSTM | Multiplicative LSTM for sequence modelling | 1.40 | 2016-09-26 | astakara48/python_project |
| 5 | BN LSTM | Recurrent Batch Normalization | 1.36 | 2016-03-30 | cooijmanstim/recurrent-batch-normalization, codedecde/Recognizing-Textual-Entailment, Tetsuya-Nishikawa/ConvLSTM_DEMO |
| 6 | LayerNorm HM-LSTM | Hierarchical Multiscale Recurrent Neural Networks | 1.29 | 2016-09-06 | bolducp/hierarchical-rnn, kaiu85/hm-rnn, nikolasthuesen/HMLSTM |
| 7 | Large RHN | Recurrent Highway Networks | 1.27 | 2016-07-12 | labmlai/annotated_deep_learning_paper_implementations, julian121266/RecurrentHighwayNetworks, jzilly/RecurrentHighwayNetworks |
| 8 | Large mLSTM +emb +WN +VD | Multiplicative LSTM for sequence modelling | 1.27 | 2016-09-26 | astakara48/python_project |
| 9 | Bipartite flows (8 flows) | Discrete Flows: Invertible Generative Models of Discrete Data | 1.23 | 2019-05-24 | google/edward2, TrentBrick/PyTorchDiscreteFlows |
| 10 | mLSTM + dynamic eval | Dynamic Evaluation of Neural Sequence Models | 1.19 | 2017-09-21 | benkrause/dynamic-evaluation, benkrause/dynamiceval-transformer, sacmehta/PRU |

All Papers (22)