
LLAMAFACTORY: Unified Efficient Fine-Tuning of 100+ Language Models

Yaowei Zheng, Richong Zhang, Junhao Zhang, Yanhan Ye, Zheyan Luo, Zhangchi Feng, Yongqiang Ma (School of Computer Science and Engineering, Beihang University, China; Yongqiang Ma also with School of Software and Microelectronics, Peking University, China), 2024

Paper Information
arXiv ID: 2403.13372
Venue: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Domain: Natural language processing
Code: https://github.com/hiyouga/LLaMA-Factory
Reproducibility: 8/10

Abstract

Efficient fine-tuning is vital for adapting large language models (LLMs) to downstream tasks. However, it requires non-trivial efforts to implement these methods on different models. We present LLAMAFACTORY, a unified framework that integrates a suite of cutting-edge efficient training methods. It provides a solution for flexibly customizing the fine-tuning of 100+ LLMs without the need for coding through the built-in web UI LLAMABOARD. We empirically validate the efficiency and effectiveness of our framework on language modeling and text generation tasks. It has been released at https://github.com/hiyouga/LLaMA-Factory and has received over 25,000 stars and 3,000 forks.

Summary

The paper presents LLAMAFACTORY, a unified framework for efficient fine-tuning of over 100 large language models (LLMs). The framework integrates various cutting-edge training methods and provides an intuitive web UI, LLAMABOARD, for customizing fine-tuning without writing code. LLAMAFACTORY organizes training around three scalable modules (Model Loader, Data Worker, and Trainer) that standardize model loading, data processing, and optimization. The authors empirically validate the framework on language modeling and text generation tasks, reporting substantial reductions in memory usage and gains in training throughput, making LLM fine-tuning accessible to a broader audience.
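
To make the division of labor concrete, below is a minimal, hedged sketch of the kind of LoRA fine-tuning pipeline that the three modules automate, written against the Hugging Face transformers/peft stack rather than LLAMAFACTORY's own API; the model id, dataset file, and hyperparameters are illustrative assumptions, not the framework's defaults.

```python
# Sketch only: model name, dataset file, and hyperparameters are assumptions.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "meta-llama/Llama-2-7b-hf"  # any causal LM hub id works

# "Model Loader": load the base model and attach low-rank (LoRA) adapters.
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# "Data Worker": tokenize a plain-text corpus into language-modeling examples.
dataset = load_dataset("text", data_files={"train": "corpus.txt"})["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

# "Trainer": standard causal-LM fine-tuning with bf16 mixed precision.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=4,
                           learning_rate=1e-4, num_train_epochs=1, bf16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

In LLAMAFACTORY itself, the equivalent choices are exposed as configuration options in LLAMABOARD, so users can make them without writing code like the above.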

Methods

This paper employs the following methods (a QLoRA-style configuration sketch follows the list):

  • LLAMAFACTORY
  • LLAMABOARD
  • mixed precision training
  • activation checkpointing
  • gradient low-rank projection
  • LoRA
  • QLoRA
  • freeze-tuning
  • flash attention
  • DeepSpeed
  • distributed training
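
Several of the listed methods are typically combined. The sketch below shows a common QLoRA-style setup (4-bit quantized frozen base weights, trainable LoRA adapters, activation checkpointing, flash attention, bf16 compute) using the Hugging Face peft and bitsandbytes packages; it is an illustrative configuration under those assumptions, not LLAMAFACTORY's exact implementation, and the model id and LoRA ranks are made up for the example.

```python
# Hedged QLoRA-style sketch; requires a CUDA GPU, bitsandbytes, and flash-attn.
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize frozen base weights to 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 (mixed precision)
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",              # illustrative model id
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # flash attention kernel
)

# Enable activation (gradient) checkpointing and prepare the quantized model.
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=True)

# Attach trainable low-rank adapters; everything else stays frozen.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```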

Models Used

  • Gemma-2B
  • Llama2-7B
  • Llama2-13B

Datasets

The following datasets were used in this research:

  • PubMed

Evaluation Metrics

  • PPL (perplexity; see the sketch below)
  • ROUGE
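
Both metrics are standard: perplexity is the exponential of the average token-level cross-entropy on held-out text, and ROUGE measures n-gram overlap between generated and reference text. The sketch below shows one common way to compute them; the example loss value, the example strings, and the use of the Hugging Face evaluate package are assumptions for illustration.

```python
# pip install evaluate rouge_score  (illustrative computation of PPL and ROUGE)
import math

import evaluate

# Perplexity from a mean cross-entropy loss (nats per token).
mean_nll = 2.08           # e.g. an averaged eval loss over a held-out corpus
ppl = math.exp(mean_nll)  # PPL = exp(mean negative log-likelihood)
print(f"perplexity: {ppl:.2f}")

# ROUGE for text generation quality.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the model summarizes the abstract"],
    references=["the model summarizes the paper abstract"],
)
print(scores)  # rouge1 / rouge2 / rougeL F-measures
```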

Results

  • LLAMAFACTORY's efficient fine-tuning methods improve memory usage, training throughput, and perplexity when adapting LLMs.
  • QLoRA achieves the lowest memory footprint during fine-tuning.

Limitations

The authors identified the following limitations:

  • Not specified

Technical Requirements

  • Number of GPUs: 1
  • GPU Type: NVIDIA A100 40GB

Keywords

Large Language Models, Efficient fine-tuning, LLAMAFACTORY, Web UI, Model customization

External Resources

  • Official code repository: https://github.com/hiyouga/LLaMA-Factory