
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

Jingfeng Yang (Amazon, USA), Hongye Jin (Texas A&M University, USA), Ruixiang Tang (Rice University, USA), Xiaotian Han (Texas A&M University, USA), Qizhang Feng (Texas A&M University, USA), Haoming Jiang (Amazon, USA), Bing Yin (Amazon, USA), and Xia Hu (Rice University, USA). 2023.

Paper Information
arXiv ID
2304.13712
Venue
ACM Transactions on Knowledge Discovery from Data
Domain
Artificial Intelligence, Machine Learning, Natural Language Processing

Abstract

This paper presents a comprehensive and practical guide for practitioners and end-users working with Large Language Models (LLMs) in their downstream natural language processing (NLP) tasks. We provide discussions and insights into the usage of LLMs from the perspectives of models, data, and downstream tasks. Firstly, we offer an introduction and brief summary of current GPT- and BERT-style LLMs. Then, we discuss the influence of pre-training data, training data, and test data. Most importantly, we provide a detailed discussion about the use and non-use cases of large language models for various natural language processing tasks, such as knowledge-intensive tasks, traditional natural language understanding tasks, natural language generation tasks, emergent abilities, and considerations for specific tasks. We present various use cases and non-use cases to illustrate the practical applications and limitations of LLMs in real-world scenarios. We also try to understand the importance of data and the specific challenges associated with each NLP task. Furthermore, we explore the impact of spurious biases on LLMs and delve into other essential considerations, such as efficiency, cost, and latency, to ensure a comprehensive understanding of deploying LLMs in practice. This comprehensive guide aims to provide researchers and practitioners with valuable insights and best practices for working with LLMs, thereby enabling the successful implementation of these models in a wide range of NLP tasks. A curated list of practical guide resources of LLMs, regularly updated, can be found at https://github.com/Mooler0410/LLMsPracticalGuide.

Summary

This paper serves as a comprehensive guide for practitioners using Large Language Models (LLMs) in natural language processing tasks. It offers insights into the utilization of LLMs such as ChatGPT and discusses models, data influences, and downstream tasks. The paper categorizes LLMs, emphasizing the distinction between BERT-style (encoder-only or encoder-decoder) and GPT-style (decoder-only) models, and explores their training strategies, architectures, and practical applications. It highlights the significance of pre-training data, discusses the choice between fine-tuned models and LLMs when no, few, or abundant annotated data are available, and addresses performance on a range of NLP tasks, including natural language understanding (NLU) and natural language generation (NLG). The paper also discusses biases, efficiency concerns, and challenges in real-world deployments, concluding with considerations for model evaluation and future challenges such as alignment with human values and the implications of scaling. The authors aim to support better understanding and implementation of LLMs across NLP tasks to promote innovation.
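
As a minimal illustration of the BERT-style vs. GPT-style distinction discussed above, the sketch below loads one representative model of each family with the Hugging Face transformers library and applies each to the objective it was pre-trained on: masked-token filling vs. next-token generation. The library and model choices are assumptions of this example, not something the survey prescribes.

```python
# Sketch: contrasting a BERT-style (encoder-only) and a GPT-style (decoder-only) model.
# Assumes the Hugging Face `transformers` library; model choices are illustrative only.
from transformers import pipeline

# Encoder-only (BERT-style): pre-trained with masked language modeling,
# typically fine-tuned for understanding tasks (classification, NER, extractive QA).
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("The capital of France is [MASK]."))

# Decoder-only (GPT-style): pre-trained with autoregressive next-token prediction,
# used for open-ended generation and zero-/few-shot prompting.
generate = pipeline("text-generation", model="gpt2")
print(generate("The capital of France is", max_new_tokens=5))
```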

Methods

This paper employs the following methods (a minimal zero-/few-shot prompting sketch follows the list):

  • LLM
  • encoder-decoder
  • decoder-only
  • fine-tuning
  • zero-shot
  • few-shot
  • transfer learning
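
To make the zero-shot vs. few-shot distinction concrete, the sketch below constructs the two prompt styles that the survey contrasts with task-specific fine-tuning. The sentiment task, labels, and demonstrations are hypothetical assumptions for illustration; any completion or chat LLM could consume the resulting prompts.

```python
# Sketch: zero-shot vs. few-shot (in-context) prompting for a toy sentiment task.
# The task and examples are hypothetical; plug the prompts into your own LLM client.

def zero_shot_prompt(text: str) -> str:
    """Ask the model to solve the task from the instruction alone (no examples)."""
    return (
        "Classify the sentiment of the following review as Positive or Negative.\n"
        f"Review: {text}\n"
        "Sentiment:"
    )

def few_shot_prompt(text: str, examples: list[tuple[str, str]]) -> str:
    """Prepend a handful of labeled demonstrations (in-context learning)."""
    demos = "\n".join(f"Review: {r}\nSentiment: {s}" for r, s in examples)
    return (
        "Classify the sentiment of each review as Positive or Negative.\n"
        f"{demos}\n"
        f"Review: {text}\n"
        "Sentiment:"
    )

if __name__ == "__main__":
    demos = [("Great plot and acting.", "Positive"),
             ("A tedious, forgettable film.", "Negative")]
    print(zero_shot_prompt("The soundtrack was wonderful."))
    print(few_shot_prompt("The soundtrack was wonderful.", demos))
```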

Models Used

  • ChatGPT
  • GPT-3
  • GPT-4
  • PaLM
  • BLOOM
  • OPT
  • T5
  • RoBERTa
  • BERT

Datasets

The following datasets were used in this research (a loading sketch follows the list):

  • SQuAD
  • IMDB
  • GLUE
  • CoNLL03
  • CivilComments
  • DDXPlus
  • MS MARCO
  • NaturalQuestions
  • TriviaQA
  • GSM8k
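
Several of the benchmarks above are distributed through the Hugging Face datasets hub; the sketch below shows one way to pull two of them for evaluation. The library choice and the hub identifiers ("squad", "gsm8k") are assumptions of this example, not something specified in the paper.

```python
# Sketch: loading two of the listed benchmarks via the Hugging Face `datasets` library.
from datasets import load_dataset

squad = load_dataset("squad", split="validation")     # extractive QA (SQuAD)
gsm8k = load_dataset("gsm8k", "main", split="test")   # grade-school math word problems

print(squad[0]["question"], "->", squad[0]["answers"]["text"])
print(gsm8k[0]["question"])
```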

Evaluation Metrics

  • ROUGE
  • BLEU
  • Accuracy
  • F1-score (see the worked example below)
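
As a worked example of two of these metrics, the sketch below computes exact-match accuracy and a token-overlap F1 in the style commonly used for SQuAD-like QA evaluation. The normalization here is deliberately simplified and is not the official scoring script.

```python
# Sketch: exact-match accuracy and token-overlap F1, simplified from SQuAD-style scoring.
from collections import Counter

def accuracy(preds: list[str], golds: list[str]) -> float:
    """Fraction of predictions that exactly match their reference (case-insensitive)."""
    return sum(p.strip().lower() == g.strip().lower() for p, g in zip(preds, golds)) / len(golds)

def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token precision and recall between prediction and reference."""
    pred_tokens, gold_tokens = pred.lower().split(), gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(accuracy(["Paris", "Berlin"], ["Paris", "Munich"]))  # 0.5
print(token_f1("the city of Paris", "Paris"))              # 0.4
```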

Results

  • LLMs excel at generation tasks, owing to their strong language understanding and creativity
  • LLMs carry substantially more real-world knowledge than fine-tuned models, which benefits knowledge-intensive tasks
  • LLMs outperform fine-tuned models in zero-shot and few-shot settings, i.e., when little or no annotated data is available

Limitations

The authors identified the following limitations:

  • LLMs can struggle with specific out-of-distribution tasks
  • Fine-tuned models often outperform LLMs in traditional NLU tasks
  • Challenges in handling noisy and unstructured real-world input

Technical Requirements

  • Number of GPUs: None specified
  • GPU Type: None specified

Keywords

Large Language Models, ChatGPT, BERT, Language Model Training, NLP Tasks, Model Scaling, Data Pretraining, Fine-tuning, Prompt Engineering, Model Evaluation, Bias and Fairness, Safety in AI

External Resources

  • Curated and regularly updated list of practical guide resources for LLMs: https://github.com/Mooler0410/LLMsPracticalGuide