
BloombergGPT: A Large Language Model for Finance

Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann, Prabhanjan Kambadur, David Rosenberg, Gideon Mann (Bloomberg, New York, NY, USA; Mark Dredze also with Computer Science, Johns Hopkins University, Baltimore, MD, USA) (2023)

Paper Information
arXiv ID
2303.17564
Venue
arXiv.org
Domain
natural language processing, finance, machine learning
SOTA Claim
Yes
Reproducibility
6/10

Abstract

The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in the literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally, we explain our modeling choices, training process, and evaluation methodology. As a next step, we plan to release training logs (Chronicles) detailing our experience in training BloombergGPT.

Summary

The paper presents BloombergGPT, a 50-billion-parameter large language model designed for financial content, trained on a mixed dataset of 363 billion tokens of domain-specific financial documents and 345 billion tokens from general-purpose corpora. The authors emphasize that prior LLMs were not specialized for finance and that BloombergGPT surpasses existing models on financial benchmarks while remaining competitive on general LLM tasks. Key contributions include the FinPile dataset, an evaluation methodology spanning public and internal benchmarks, and insights into training and tokenization strategies. Future work includes releasing training logs and further evaluations.
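
To make the mixed-dataset training idea concrete, here is a minimal sketch that samples training examples from a domain corpus and a general corpus in proportion to their reported token counts (363B vs. 345B). The corpus names, the sampling scheme, and the proportional weighting are illustrative assumptions, not the paper's actual data pipeline.

```python
import random

# Token counts reported for BloombergGPT's training mix (in billions).
FINPILE_TOKENS = 363   # domain-specific financial data (FinPile)
GENERAL_TOKENS = 345   # general-purpose data

# Assumed scheme: sample each training example from a corpus with
# probability proportional to that corpus's share of the token budget.
total = FINPILE_TOKENS + GENERAL_TOKENS
weights = {
    "finpile": FINPILE_TOKENS / total,   # ~0.51
    "general": GENERAL_TOKENS / total,   # ~0.49
}

def sample_corpus(rng: random.Random) -> str:
    """Pick which corpus the next training example is drawn from."""
    return rng.choices(list(weights), weights=list(weights.values()), k=1)[0]

rng = random.Random(0)
counts = {"finpile": 0, "general": 0}
for _ in range(10_000):
    counts[sample_corpus(rng)] += 1
print(counts)  # roughly a 51% / 49% split between the two corpora
```

The point of the sketch is only that the financial and general data contribute in roughly equal proportion, which is what lets the model improve on financial tasks without giving up general-purpose performance.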

Methods

This paper employs the following methods:

  • Transformer

Models Used

  • BloombergGPT

Datasets

The following datasets were used in this research:

  • FinPile

Evaluation Metrics

  • F1 score
  • Exact match accuracy
  • Weighted F1 score (see the worked sketch below)
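
As a worked illustration of the metrics listed above, the sketch below computes exact match accuracy, macro F1, and weighted F1 on toy predictions. The labels and the use of scikit-learn are assumptions for illustration only, not the paper's evaluation harness.

```python
from sklearn.metrics import f1_score

# Toy gold labels and model predictions for a 3-class task
# (e.g., negative / neutral / positive sentiment). Labels are illustrative.
y_true = ["pos", "neg", "neu", "pos", "neu", "neg", "pos"]
y_pred = ["pos", "neu", "neu", "pos", "neg", "neg", "neu"]

# Exact match accuracy: fraction of predictions identical to the gold label.
exact_match = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Macro F1 averages per-class F1 equally; weighted F1 weights each class
# by its support, which matters on imbalanced financial datasets.
macro_f1 = f1_score(y_true, y_pred, average="macro")
weighted_f1 = f1_score(y_true, y_pred, average="weighted")

print(f"exact match: {exact_match:.2f}")
print(f"macro F1:    {macro_f1:.2f}")
print(f"weighted F1: {weighted_f1:.2f}")
```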

Results

  • BloombergGPT outperforms existing models on financial tasks
  • Competitive performance on general LLM benchmarks

Limitations

The authors identified the following limitations:

  • BloombergGPT is trained on proprietary data and therefore cannot be released publicly
  • Challenges in evaluating domain-specific benchmarks

Technical Requirements

  • Number of GPUs: 512
  • GPU Type: NVIDIA A100 (40 GB) (see the back-of-envelope sketch below)
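
A back-of-envelope sketch of why a 50B-parameter model requires a sharded multi-GPU setup at this scale. The 2 bytes/parameter (bf16) and ~16 bytes/parameter training-state figures are common rules of thumb, not numbers taken from the paper.

```python
# Reported setup: 512 x NVIDIA A100 (40 GB) for a 50B-parameter model.
params = 50e9
bytes_bf16 = 2

weights_gb = params * bytes_bf16 / 1e9     # ~100 GB just for the weights
# Rule of thumb for mixed-precision Adam training: roughly 16 bytes per
# parameter once gradients and optimizer states are included.
train_state_gb = params * 16 / 1e9         # ~800 GB

cluster_gpu_memory_gb = 512 * 40           # 20,480 GB across the cluster
print(f"weights only:       ~{weights_gb:.0f} GB")
print(f"training state:     ~{train_state_gb:.0f} GB")
print(f"cluster GPU memory:  {cluster_gpu_memory_gb} GB")
# The training state alone far exceeds a single 40 GB GPU, so the model
# must be sharded across many GPUs (e.g., tensor/pipeline parallelism
# and/or ZeRO-style optimizer-state sharding).
```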

Keywords

BloombergGPT, large language model, finance, domain-specific dataset, training process, evaluation
