WebQuestions

Question Answering Benchmark

Showing 36 results. Metric: EM (exact match, in percent).
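The page does not say how EM is computed. As a rough illustration only, the sketch below assumes the SQuAD-style convention common in open-domain QA: lowercase the text, strip punctuation and English articles, collapse whitespace, then count a prediction as correct if it equals any gold answer. The function names (`normalize_answer`, `exact_match`, `em_score`) are hypothetical, and individual papers on this leaderboard may normalize differently.

```python
import re
import string
from typing import Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; not confirmed by this page.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: Sequence[str],
             references: Sequence[Sequence[str]]) -> float:
    """Percentage of questions answered with an exact match."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)
```

Under this convention, a question with several acceptable gold answers counts as solved when the prediction matches any one of them, which is why `references` is a list of lists.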

Top Performing Models

| Rank | Model | Paper | EM | Date | Code |
|---|---|---|---|---|---|
| 1 | CoA | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | 70.70 | 2024-03-26 | MAGICS-LAB/Chain-of-Actions |
| 2 | CoA w/o actions | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | 64.70 | 2024-03-26 | MAGICS-LAB/Chain-of-Actions |
| 3 | DSP | DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines | 59.40 | 2023-10-05 | stanfordnlp/dsp, stanfordnlp/dspy, codelion/optillm |
| 4 | DSP | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | 59.40 | 2024-03-26 | MAGICS-LAB/Chain-of-Actions |
| 5 | FiE+PAQ | FiE: Building a Global Probability Space by Leveraging Early Fusion in Encoder for Open-Domain Question Answering | 56.30 | 2022-11-18 | - |
| 6 | FiE | FiE: Building a Global Probability Space by Leveraging Early Fusion in Encoder for Open-Domain Question Answering | 52.40 | 2022-11-18 | - |
| 7 | FiDO | FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference | 51.10 | 2022-12-15 | - |
| 8 | RAG | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | 45.20 | 2020-05-22 | huggingface/transformers, assafelovic/gpt-researcher, deepset-ai/haystack |
| 9 | Few-shot | Language Models are Few-Shot Learners | 44.70 | 2020-05-28 | ggml-org/llama.cpp, ggerganov/llama.cpp, karpathy/llm.c |
| 10 | Few-shot | Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models | 44.70 | 2024-03-26 | MAGICS-LAB/Chain-of-Actions |

All Papers (36)

Language Models are Few-Shot Learners

2020
GPT-3-175B (Few-Shot)

Language Models are Few-Shot Learners

2020
GPT-3-175B (One-Shot)

Language Models are Few-Shot Learners

2020
GPT-3-175B (Zero-Shot)