
RAFT: Adapting Language Model to Domain Specific RAG

Tianjun Zhang, Shishir G. Patil, Naman Jain, Sheng Shen, Matei Zaharia, Ion Stoica, Joseph E. Gonzalez
Department of Computer Science, UC Berkeley, Berkeley, CA 94720, USA (2024)

Paper Information
arXiv ID: 2403.10131
Venue: arXiv.org
Domain: natural language processing
Code: None specified
Reproducibility: 8/10

Abstract

Pretraining Large Language Models (LLMs) on large corpora of textual data is now a standard paradigm. When using these LLMs for many downstream applications, it is common to additionally incorporate new information into the pretrained model either through RAG-based prompting or fine-tuning. However, the best methodology to incorporate information remains an open question. In this paper, we present Retrieval Augmented Fine Tuning (RAFT), a training recipe which improves the model's ability to answer questions in "open-book" in-domain settings. In training RAFT, given a question and a set of retrieved documents, we train the model to ignore those documents that don't help in answering the question, which we call distractor documents. RAFT accomplishes this by citing verbatim the right sequence from the relevant document to help answer the question. This, coupled with RAFT's chain-of-thought-style response, helps improve the model's ability to reason. In domain-specific RAG, RAFT consistently improves the model's performance across PubMed, HotpotQA, and Gorilla datasets, presenting a post-training recipe to improve pre-trained LLMs for in-domain RAG.

Summary

This paper presents Retrieval Augmented Fine Tuning (RAFT), a method designed to improve the performance of large language models (LLMs) on domain-specific retrieval-augmented generation (RAG) tasks. RAFT trains the model to recognize and disregard distractor documents that do not contribute to answering a question, while drawing on the relevant documents to generate answers, thereby improving reasoning. The methodology combines instruction fine-tuning with RAG, enabling pre-trained LLMs to be adapted to specific domains. The authors evaluate RAFT on several datasets and find that it consistently outperforms standard fine-tuning baselines, highlighting the value of including distractor documents during training and of a chain-of-thought response format.
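To make the training recipe concrete, the sketch below assembles one RAFT-style training example. It is a minimal illustration, not the authors' released code: the function names, the document templating, and the <ANSWER> tag are assumptions, while the oracle/distractor mix and the verbatim ##begin_quote##/##end_quote## citation markers follow the paper's description.

```python
import random

def build_raft_prompt(question, oracle_doc, distractor_pool,
                      num_distractors=4, p_keep_oracle=0.8):
    """Assemble the prompt side of one RAFT-style training example.

    A fraction p_keep_oracle of examples keep the oracle (golden)
    document in the context alongside sampled distractors; the rest use
    distractors only, so the model cannot assume the answer is always
    present in the retrieved context.
    """
    context = random.sample(distractor_pool, num_distractors)
    if random.random() < p_keep_oracle:
        context.append(oracle_doc)
    else:
        # Distractor-only example; one extra distractor keeps the
        # context size constant.
        context.append(random.choice(distractor_pool))
    random.shuffle(context)  # the oracle's position should carry no signal
    docs = "\n\n".join(f"Document: {d}" for d in context)
    return f"{docs}\n\nQuestion: {question}"

def build_raft_target(reasoning, quote, answer):
    """Chain-of-thought target citing the relevant document verbatim.

    The ##begin_quote##/##end_quote## markers follow the paper's reported
    format; the <ANSWER> tag is an illustrative choice, not necessarily
    the authors' exact one.
    """
    return (f"{reasoning} ##begin_quote## {quote} ##end_quote## "
            f"<ANSWER>: {answer}")
```

At test time the same documents-plus-question shape is filled by a real retriever's top-k results, so the training format matches the deployment distribution.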

Methods

This paper employs the following methods:

  • Retrieval Augmented Fine Tuning (RAFT)
  • Supervised Fine-Tuning (SFT) (contrasted with RAG prompting in the sketch after this list)
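As a hypothetical contrast between the two methods listed above: plain SFT trains on question-answer pairs with no retrieved context, while RAG prompting at test time places retrieved chunks before the question. The string templates and function names below are assumptions for illustration, not the paper's exact formats.

```python
def build_sft_example(question, answer):
    # Plain supervised fine-tuning: the prompt carries no retrieved
    # context, so the model must answer from its parametric memory.
    return {"prompt": f"Question: {question}\nAnswer:", "target": answer}

def build_rag_prompt(question, retrieved_docs):
    # Test-time RAG prompt: retrieved chunks precede the question. RAFT
    # trains on this same shape (with distractors mixed in), so training
    # matches what the model sees when deployed behind a retriever.
    context = "\n\n".join(f"Document: {d}" for d in retrieved_docs)
    return f"{context}\n\nQuestion: {question}\nAnswer:"
```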

Models Used

  • LLaMA-2
  • GPT-3.5

Datasets

The following datasets were used in this research:

  • PubMed
  • HotpotQA
  • Gorilla
  • Natural Questions (NQ)
  • TriviaQA

Evaluation Metrics

  • Accuracy (see the sketch below)
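The paper reports answer accuracy on each benchmark. Below is a minimal sketch of an exact-match accuracy computation; the lowercasing and whitespace normalization are assumed conveniences, not the paper's documented matching rules.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions matching the reference answer after
    lowercasing and whitespace normalization (an assumed rule)."""
    normalize = lambda s: " ".join(s.lower().split())
    hits = sum(normalize(p) == normalize(r)
               for p, r in zip(predictions, references))
    return hits / len(references)
```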

Results

  • RAFT consistently improves model performance across datasets like PubMed, HotpotQA, and Gorilla.
  • RAFT outperforms domain-specific fine-tuning baselines both with and without RAG at inference time.

Limitations

The authors identified the following limitations:

  • The model can still be misled by irrelevant retrieved text, and the training data must be tuned carefully to balance examples containing the relevant (oracle) document against distractor-only examples.

Technical Requirements

  • Number of GPUs: None specified
  • GPU Type: None specified

Keywords

language models, retrieval-augmented generation, domain adaptation, fine-tuning, chain-of-thought
