LAMBADA

Name: LAMBADA
Published: 2016-01-01
License: CC BY 4.0

Dataset Information

Modalities: Texts
Languages: English
Introduced: 2016
License: CC BY 4.0

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

LAMBADA (LAnguage Modeling Broadened to Account for Discourse Aspects) is an open-ended cloze benchmark of about 10,000 passages from BooksCorpus. For each passage, a model must predict a missing target word, which is always the last word of the last sentence; no candidate words are provided. Human annotators filtered the examples so that the target word can be guessed given the context, i.e., the sentences in the passage leading up to the last sentence. Examples were further filtered so that the target word cannot be guessed without that context, which means a model has to reason over the entire passage rather than the last sentence alone.

Source: Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Image Source: https://arxiv.org/pdf/1606.06031.pdf

Variants: LAMBADA
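
The cloze setup described in the overview can be sketched in a few lines of Python. This is a minimal illustration, not the official evaluation code: it assumes passages are whitespace-tokenized strings, the helper names `split_cloze` and `last_word_accuracy` are hypothetical, and the example passage is made up rather than taken from the dataset.

```python
from typing import List, Tuple


def split_cloze(passage: str) -> Tuple[str, str]:
    """Split a passage into (context, target), where target is the final word.

    Assumes words are separated by single spaces (an assumption of this sketch).
    """
    context, _, target = passage.strip().rpartition(" ")
    return context, target


def last_word_accuracy(predictions: List[str], targets: List[str]) -> float:
    """Fraction of examples whose predicted word exactly matches the target."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(p.strip() == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Made-up passage for illustration only.
passage = "She handed him the keys and pointed at the old car . He smiled and took the keys"
context, target = split_cloze(passage)
```

A model would be shown `context` and asked to produce `target`; since no candidate words are given, scoring here is by exact string match.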

Associated Benchmarks

This dataset is used in 1 benchmark:

Language Modelling - Metrics: Accuracy, Perplexity
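
As a rough sketch of the perplexity metric listed above, the function below exponentiates the negative mean log-probability that a model assigns to the target words. It assumes one natural-log probability per example; papers differ in whether they normalize per word or per subword token, so treat this as one possible convention. The name `target_perplexity` is hypothetical.

```python
import math
from typing import Sequence


def target_perplexity(log_probs: Sequence[float]) -> float:
    """Perplexity as exp of the negative mean natural-log probability.

    `log_probs` holds the model's log-probability of each target word
    (one value per example, an assumption of this sketch).
    """
    if not log_probs:
        raise ValueError("need at least one log-probability")
    return math.exp(-sum(log_probs) / len(log_probs))
```

For instance, a model that assigns probability 0.5 to every target word would get a perplexity of 2 under this convention.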

Recent Benchmark Submissions

Task	Model	Paper	Date
Language Modelling	Mamba-2.8B	Mamba: Linear-Time Sequence Modeling with …	2023-12-01
Language Modelling	LLaMA-65B+CFG (Zero-Shot)	Stay on topic with Classifier-Free …	2023-06-30
Language Modelling	LLaMA-30B+CFG (zero-shot)	Stay on topic with Classifier-Free …	2023-06-30
Language Modelling	LLaMA-13B+CFG (zero-shot)	Stay on topic with Classifier-Free …	2023-06-30
Language Modelling	PaLM 2-M (one-shot)	PaLM 2 Technical Report	2023-05-17
Language Modelling	PaLM 2-L (one-shot)	PaLM 2 Technical Report	2023-05-17
Language Modelling	PaLM 2-S (one-shot)	PaLM 2 Technical Report	2023-05-17
Language Modelling	Pythia 12B (0-shot)	Pythia: A Suite for Analyzing …	2023-04-03
Language Modelling	Pythia 6.9B (Zero-Shot)	Pythia: A Suite for Analyzing …	2023-04-03
Language Modelling	Pythia 12B (Zero-Shot)	Pythia: A Suite for Analyzing …	2023-04-03
Language Modelling	Pythia 6.9B (0-shot)	Pythia: A Suite for Analyzing …	2023-04-03
Language Modelling	SparseGPT (175B, 2:4 Sparsity)	SparseGPT: Massive Language Models Can …	2023-01-02
Language Modelling	SparseGPT (175B, 4:8 Sparsity)	SparseGPT: Massive Language Models Can …	2023-01-02
Language Modelling	SparseGPT (175B, 50% Sparsity)	SparseGPT: Massive Language Models Can …	2023-01-02
Language Modelling	OPT-175B (50% Sparsity)	SparseGPT: Massive Language Models Can …	2023-01-02
Language Modelling	OPT-175B	SparseGPT: Massive Language Models Can …	2023-01-02
Language Modelling	GLM-130B (bidirectional attention)	GLM-130B: An Open Bilingual Pre-trained …	2022-10-05
Language Modelling	PaLM-540B (One-Shot)	PaLM: Scaling Language Modeling with …	2022-04-05
Language Modelling	PaLM-540B (Few-Shot)	PaLM: Scaling Language Modeling with …	2022-04-05
Language Modelling	PaLM-540B (Zero-Shot)	PaLM: Scaling Language Modeling with …	2022-04-05

Research Papers

Recent papers with results on this dataset:

Mamba: Linear-Time Sequence Modeling with Selective State Spaces (2023)
Stay on topic with Classifier-Free Guidance (2023)
PaLM 2 Technical Report (2023)
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (2023)
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot (2023)
GLM-130B: An Open Bilingual Pre-trained Model (2022)
PaLM: Scaling Language Modeling with Pathways (2022)
Training Compute-Optimal Large Language Models (2022)
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model (2022)
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts (2021)
GLM: General Language Model Pretraining with Autoregressive Blank Infilling (2021)
Language Models are Few-Shot Learners (2020)
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences (2020)
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts (2019)
Universal Transformers (2018)
Broad Context Language Modeling as Reading Comprehension (2016)
