CHOCOLATE

Captions Have Often ChOsen Lies About The Evidence

Dataset Information
Modalities: Images, Texts
Languages: English
Introduced: 2023

Overview

CHOCOLATE is a benchmark for detecting and correcting factual inconsistency in generated chart captions. Its captions are produced by six models, grouped into three subsets:

  • LVLM: GPT-4V, Bard (before Gemini)
  • LLM-based Pipeline: DePlot + GPT-4
  • Fine-tuned Model: ChartT5, MatCha, UniChart
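The grouping above can be written down as a small lookup table, which is handy when splitting results by subset. This is only a minimal sketch: the subset and model names come from the list above, while the dictionary layout and the `subset_of` helper are illustrative assumptions, not part of any official CHOCOLATE tooling.

```python
# Subset membership as listed above; the dictionary layout is illustrative.
SUBSETS = {
    "LVLM": ["GPT-4V", "Bard"],
    "LLM-based Pipeline": ["DePlot + GPT-4"],
    "Fine-tuned Model": ["ChartT5", "MatCha", "UniChart"],
}

# Invert the table into a model -> subset lookup.
MODEL_TO_SUBSET = {
    model: subset for subset, models in SUBSETS.items() for model in models
}


def subset_of(model: str) -> str:
    """Return the subset a captioning model belongs to (hypothetical helper).

    Raises KeyError for a model that is not one of the six listed above.
    """
    return MODEL_TO_SUBSET[model]
```

For example, `subset_of("MatCha")` looks up the fine-tuned subset, and an unknown model name raises `KeyError` rather than silently falling into a default group.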

The charts come from two datasets: VisText and the Pew split of Chart-to-Text. In total, CHOCOLATE contains 1,187 examples. Each instance pairs a caption generated by one of the models with annotations of the factual errors in each caption sentence.
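To make the instance structure concrete, here is one way a single example could be modeled in Python. Treat it as a sketch under stated assumptions: the class names, field names, and the `"value_error"` label are hypothetical and may not match the released files, and deriving a caption-level flag from the sentence-level annotations is an illustrative choice rather than the dataset's documented schema. The sample sentences are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SentenceAnnotation:
    """One caption sentence with its annotated factual errors (hypothetical schema)."""
    sentence: str
    # An empty list means no factual error was annotated for this sentence.
    error_types: List[str] = field(default_factory=list)


@dataclass
class CaptionInstance:
    """One example: a model-generated caption with sentence-level annotations."""
    chart_id: str          # hypothetical identifier
    source_dataset: str    # e.g. "VisText" or "Chart-to-Text (Pew)"
    model: str             # one of the six captioning models
    sentences: List[SentenceAnnotation]

    @property
    def caption(self) -> str:
        """Full caption text, rebuilt by joining the sentences with spaces."""
        return " ".join(s.sentence for s in self.sentences)

    def has_factual_error(self) -> bool:
        """True if at least one sentence carries an annotated error."""
        return any(s.error_types for s in self.sentences)


# Made-up example data, for illustration only.
example = CaptionInstance(
    chart_id="demo-001",
    source_dataset="VisText",
    model="UniChart",
    sentences=[
        SentenceAnnotation("Sales rose from 10 to 20."),
        SentenceAnnotation("The peak was in 2015.", ["value_error"]),
    ],
)
```

Keeping the annotations at the sentence level, as the dataset does, lets a caption-level label be derived when needed without losing which sentence is at fault.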

Paper Information

Citation

If you use the CHOCOLATE dataset in your work, please cite the paper using this BibTeX entry:

@misc{huang-etal-2023-do,
    title = {Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning},
    author = {Huang, Kung-Hsiang and
      Zhou, Mingyang and
      Chan, Hou Pong and
      Fung, Yi R. and
      Wang, Zhenhailong and
      Zhang, Lingyu and
      Chang, Shih-Fu and
      Ji, Heng},
    year = {2023},
    eprint = {2312.10160},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL}
}

Variants: CHOCOLATE, CHOCOLATE-LLM, CHOCOLATE-FT, CHOCOLATE-LVLM

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Factual Inconsistency Detection in Chart Captioning | ChartVE | Do LVLMs Understand Charts? Analyzing … | 2023-12-15

Research Papers

Recent papers with results on this dataset: