Venue
Neural Information Processing Systems
Domain
Artificial Intelligence, Natural Language Processing
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, "Tree of Thoughts" (ToT), which generalizes over the popular "Chain of Thought" approach to prompting language models, and enables exploration over coherent units of text ("thoughts") that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: https://github.com/princeton-nlp/tree-of-thought-llm.
37th Conference on Neural Information Processing Systems (NeurIPS 2023).
The paper introduces the Tree of Thoughts (ToT) framework for enhancing problem solving in large language models (LMs). Traditional LMs, though capable across a broad range of textual tasks, are limited on tasks requiring strategic reasoning and planning by their autoregressive, token-level decision-making process. The ToT framework lets LMs explore multiple reasoning paths, evaluate choices, and backtrack when necessary by framing problem solving as search over a tree whose nodes are coherent "thoughts" serving as intermediate steps toward a solution. The authors empirically demonstrate the effectiveness of ToT on three novel tasks: Game of 24, Creative Writing, and Mini Crosswords, achieving substantial improvements in success rates over standard prompting methods. The framework draws on cognitive-science accounts of human decision making, specifically the distinction between fast (System 1) and deliberative (System 2) reasoning, and enhances LMs' problem-solving capabilities without additional training.
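The propose-evaluate-select loop described above can be sketched as follows. This is a minimal toy illustration, not the paper's code: `propose_thoughts` and `value_thought` are hypothetical stand-ins for the LM's "propose" and "value" prompts, here replaced by a trivial string-building problem so the loop is runnable.

```python
# Sketch of the ToT search loop (breadth-first variant).
# propose_thoughts / value_thought are stand-ins for LM calls (assumption).

def propose_thoughts(state):
    """Stand-in for an LM call that proposes candidate next thoughts."""
    return [state + ch for ch in "abc"]

def value_thought(state, target="abc"):
    """Stand-in for an LM call that scores how promising a partial state is."""
    return sum(1 for a, b in zip(state, target) if a == b)

def tot_bfs(initial="", steps=3, breadth=2):
    frontier = [initial]
    for _ in range(steps):
        # Expand every frontier state, score all candidates,
        # and keep only the `breadth` most promising ones.
        candidates = [s for state in frontier for s in propose_thoughts(state)]
        candidates.sort(key=value_thought, reverse=True)
        frontier = candidates[:breadth]
    return frontier[0]

print(tot_bfs())  # → "abc" under this toy scorer
```

With a real LM, each candidate is a short text passage (an equation step, a writing plan) and the value function is a separate prompt asking the model to rate or vote on candidates.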
This paper employs the following methods:
- Tree of Thoughts (ToT)
- Breadth-First Search (BFS)
- Depth-First Search (DFS)
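The DFS variant listed above explores one branch at a time, pruning unpromising states and backtracking. A minimal runnable sketch under the same toy assumptions as before (the heuristic scorer stands in for an LM value prompt; names are illustrative, not the paper's code):

```python
# Sketch of the DFS variant with value-based pruning and backtracking.
# The inline score is a stand-in for an LM "value" call (assumption).

def dfs(state, target, depth, threshold=1):
    if state == target:
        return state
    if depth == 0:
        return None
    for ch in "abc":
        child = state + ch
        # Prune children whose heuristic score falls below the threshold
        # (capped by the child's length); otherwise recurse, and backtrack
        # to the next candidate if the subtree yields no solution.
        score = sum(1 for a, b in zip(child, target) if a == b)
        if score < min(len(child), threshold):
            continue
        found = dfs(child, target, depth - 1, threshold)
        if found is not None:
            return found
    return None

print(dfs("", "cab", depth=3))  # → "cab"
```

BFS suits tasks with shallow, wide trees (Game of 24, Creative Writing), while DFS with backtracking suits deeper searches where most branches can be pruned early (Mini Crosswords).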
The following datasets were used in this research:
- Game of 24
- Creative Writing
- Mini Crosswords
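For context on the first task: Game of 24 asks whether four given numbers can be combined with +, -, *, / to reach 24. A small brute-force checker (illustrative only, not the paper's evaluation code) makes the search space concrete:

```python
# Brute-force feasibility check for Game of 24: repeatedly pick two
# values, combine them with an arithmetic operation, and recurse.
from itertools import permutations

def solvable_24(nums, target=24, eps=1e-6):
    def search(vals):
        if len(vals) == 1:
            return abs(vals[0] - target) < eps
        for a, b in permutations(range(len(vals)), 2):
            rest = [v for i, v in enumerate(vals) if i not in (a, b)]
            results = [vals[a] + vals[b], vals[a] - vals[b], vals[a] * vals[b]]
            if abs(vals[b]) > eps:  # skip division by zero
                results.append(vals[a] / vals[b])
            if any(search(rest + [r]) for r in results):
                return True
        return False
    return search([float(n) for n in nums])

print(solvable_24([4, 9, 10, 13]))  # → True: (10 - 4) * (13 - 9) = 24
print(solvable_24([1, 1, 1, 1]))    # → False
```

In the paper's setup, each ToT "thought" for this task is one intermediate equation (e.g. combining two of the remaining numbers), so a solution is a depth-3 path in the search tree.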
The following evaluation metrics were used:
- Success Rate
- Coherency Score
The main results include:
- ToT achieved a success rate of 74% on Game of 24, compared to 4% with chain-of-thought prompting.
- ToT generated more coherent passages with an average score of 7.56 compared to 6.19 for IO and 6.93 for CoT in Creative Writing.
- ToT improved word-level success rates to 60% in Mini Crosswords.
The authors identified the following limitations:
- None specified
Compute resources used:
- Number of GPUs: None specified
- GPU Type: None specified
Keywords
Tree of Thoughts, problem solving, search algorithms, prompting techniques, large language models