TruthfulQA

Question Answering Benchmark

Performance Over Time

Showing 30 results. Metric: MC1

Top Performing Models

| Rank | Model | Paper | MC1 | Date | Code |
|------|-------|-------|-----|------|------|
| 1 | Shakti-LLM (2.5B) | SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments | 68.40 | 2024-10-15 | - |
| 2 | CoA | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | 67.30 | 2024-03-26 | MAGICS-LAB/Chain-of-Actions |
| 3 | ToT | Tree of Thoughts: Deliberate Problem Solving with Large Language Models | 66.60 | 2023-05-17 | ysymyth/tree-of-thought-llm, princeton-nlp/tree-of-thought-llm, codelion/optillm |
| 4 | CoA w/o actions | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | 63.30 | 2024-03-26 | MAGICS-LAB/Chain-of-Actions |
| 5 | LLaMA 65B | LLaMA: Open and Efficient Foundation Language Models | 53.00 | 2023-02-27 | huggingface/transformers, ggml-org/llama.cpp, ggerganov/llama.cpp |
| 6 | LLaMA 33B | LLaMA: Open and Efficient Foundation Language Models | 48.00 | 2023-02-27 | huggingface/transformers, ggml-org/llama.cpp, ggerganov/llama.cpp |
| 7 | Auto-CoT | Automatic Chain of Thought Prompting in Large Language Models | 42.20 | 2022-10-07 | microsoft/guidance, guidance-ai/guidance, amazon-research/auto-cot, amazon-science/auto-cot, lastmile-ai/aiconfig |
| 8 | LLaMA 13B | LLaMA: Open and Efficient Foundation Language Models | 41.00 | 2023-02-27 | huggingface/transformers, ggml-org/llama.cpp, ggerganov/llama.cpp |
| 9 | LLaMA 7B | LLaMA: Open and Efficient Foundation Language Models | 29.00 | 2023-02-27 | huggingface/transformers, ggml-org/llama.cpp, ggerganov/llama.cpp |
| 10 | GPT-4 (RLHF) | GPT-4 Technical Report | 0.59 (listed as a fraction; the other rows appear to be percentages) | 2023-03-15 | openai/evals, shmsw25/factscore, unispac/visual-adversarial-examples-jailbreak-large-language-models |
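
For readers comparing the MC1 column, the following is a minimal sketch of how an MC1-style score is commonly computed: each question has exactly one correct choice, and a question counts as a hit when the model gives that choice the highest score. The function names (`mc1_score`, `to_percent`) and the input layout are assumptions made for illustration, not code from any of the listed papers. The `to_percent` helper is a heuristic for putting fraction-scale entries (such as the 0.59 listed for GPT-4) on the same 0-100 scale as the other rows.

```python
from typing import Sequence


def mc1_score(choice_scores: Sequence[Sequence[float]],
              correct_idx: Sequence[int]) -> float:
    """Return an MC1-style accuracy as a percentage.

    choice_scores[i][j] is the model's score (e.g. log-probability) for
    answer choice j of question i; correct_idx[i] is the index of the
    single correct choice. This input layout is an assumption.
    """
    if len(choice_scores) != len(correct_idx):
        raise ValueError("one correct index is required per question")
    if not choice_scores:
        raise ValueError("at least one question is required")

    hits = 0
    for scores, gold in zip(choice_scores, correct_idx):
        # Index of the highest-scoring choice (first one wins on ties).
        best = max(range(len(scores)), key=scores.__getitem__)
        hits += int(best == gold)
    return 100.0 * hits / len(choice_scores)


def to_percent(value: float) -> float:
    """Heuristic: treat values in [0, 1] as fractions and scale them to 0-100."""
    return value * 100.0 if 0.0 <= value <= 1.0 else value
```

The scale heuristic cannot distinguish a genuine score below 1% from a fraction, so it is only reasonable when, as here, the remaining entries make the intended scale obvious.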

All Papers (30)