CoQA

Question Answering Benchmark

Showing 9 results. Metric: In-domain

Top Performing Models

| Rank | Model | Paper | In-domain | Date | Code |
|------|-------|-------|-----------|------|------|
| 1 | GPT-3 175B (few-shot, k=32) | Language Models are Few-Shot Learners | 85.00 | 2020-05-28 | ggml-org/llama.cpp, ggerganov/llama.cpp, karpathy/llm.c |
| 2 | BERT Large Augmented (single model) | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 81.10 | 2018-10-11 | huggingface/transformers, tensorflow/models, labmlai/annotated_deep_learning_paper_implementations |
| 3 | SDNet (ensemble) | SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering | 79.30 | 2018-12-10 | Microsoft/SDNet, gooofy/zbrain, code-implementation1/Code7 |
| 4 | BERT-base finetune (single model) | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 78.10 | 2018-10-11 | huggingface/transformers, tensorflow/models, labmlai/annotated_deep_learning_paper_implementations |
| 5 | SDNet (single model) | SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering | 76.60 | 2018-12-10 | Microsoft/SDNet, gooofy/zbrain, code-implementation1/Code7 |
| 6 | FlowQA (single model) | FlowQA: Grasping Flow in History for Conversational Machine Comprehension | 75.00 | 2018-10-06 | momohuang/FlowQA |
| 7 | BiDAF++ (single model) | A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC | 67.80 | 2018-09-27 | my89/co-squac |
| 8 | DrQA + seq2seq with copy attention (single model) | CoQA: A Conversational Question Answering Challenge | 65.10 | 2018-08-21 | stanfordnlp/coqa-baselines, mrzjy/sunburst, leozhoujf/DataSciComp, iit-nlp-research/chatgpt-crawler |
| 9 | Vanilla DrQA (single model) | CoQA: A Conversational Question Answering Challenge | 52.60 | 2018-08-21 | stanfordnlp/coqa-baselines, mrzjy/sunburst, leozhoujf/DataSciComp, iit-nlp-research/chatgpt-crawler |

All Papers (9)

Language Models are Few-Shot Learners

2020
GPT-3 175B (few-shot, k=32)