Bamboogle

Dataset Information
Introduced: 2022
License: Unknown

Overview

The Bamboogle dataset is a collection of questions constructed to investigate how well language models perform compositional reasoning. It consists of questions that Google answers incorrectly, covering a variety of question types and subject areas, each phrased in its own way.

Variants: Bamboogle

Associated Benchmarks

This dataset is used in 1 benchmark, for the Question Answering task.

Recent Benchmark Submissions

Task | Model | Paper | Date
Question Answering | ReST meets ReAct (PaLM 2-L + Google Search) | ReST meets ReAct: Self-Improvement for … | 2023-12-15
Question Answering | FireAct | FireAct: Toward Language Agent Fine-tuning | 2023-10-09
Question Answering | RALM (LLaMA2-13B + Google Search) | Making Retrieval-Augmented Language Models Robust … | 2023-10-02
Question Answering | MCR (code-davinci-002) + Google Search | Answering Questions by Meta-Reasoning over … | 2023-04-25
Question Answering | Direct Prompting (GPT-3; davinci-002) | Measuring and Narrowing the Compositionality … | 2022-10-07
Question Answering | Google Search | Measuring and Narrowing the Compositionality … | 2022-10-07
Question Answering | Self-ask (GPT-3; davinci-002) + Google Search | Measuring and Narrowing the Compositionality … | 2022-10-07
Question Answering | Self-ask (GPT-3; davinci-002) | Measuring and Narrowing the Compositionality … | 2022-10-07
Question Answering | Chain-of-Thought (GPT-3; davinci-002) | Measuring and Narrowing the Compositionality … | 2022-10-07

Research Papers

Recent papers with results on this dataset: