Lei Huang (Harbin Institute of Technology, Harbin, China); Weijiang Yu (Huawei Inc., Shenzhen, China); Weitao Ma (Harbin Institute of Technology, Harbin, China); Weihong Zhong (Harbin Institute of Technology, Harbin, China); Zhangyin Feng (Harbin Institute of Technology, Harbin, China); Haotian Wang (Harbin Institute of Technology, Harbin, China); Qianglong Chen (Huawei Inc., Shenzhen, China); Weihua Peng (Huawei Inc., Shenzhen, China); Xiaocheng Feng (Harbin Institute of Technology, Harbin, China; xcfeng@ir.hit.edu.cn); Bing Qin (Harbin Institute of Technology, Harbin, China); Ting Liu (Harbin Institute of Technology, Harbin, China) (2023)
This survey examines hallucination in large language models (LLMs): their tendency to generate plausible but factually incorrect content, and what that means for their reliability in real-world use. It presents a new taxonomy with two types of hallucination: factuality hallucinations (discrepancies with verifiable facts) and faithfulness hallucinations (divergences from user instructions). The survey traces hallucinations to three groups of contributing factors (data flaws, training issues, and inference complications) and reviews detection methods and evaluation benchmarks. It then discusses mitigation strategies aimed at better aligning LLM outputs with factual correctness and user directives, and notes the challenges that remain in ensuring LLM reliability and safety in practical applications. Finally, it identifies open questions for future research on LLM hallucination, calling for a deeper understanding of its sources and solutions.
The following datasets were used in this research: