MedQA

Dataset Information
Introduced: 2020
License: Unknown

Overview

MedQA is a multiple-choice question answering dataset based on the United States Medical Licensing Examination (USMLE). Its questions are collected from professional medical board exams. The dataset covers three languages: English, simplified Chinese, and traditional Chinese, with 12,723, 34,251, and 14,123 questions, respectively.
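As a rough illustration of how a multiple-choice benchmark like this is typically scored, the sketch below tallies the per-language question counts quoted above and computes plain accuracy over predicted option labels. The `MultipleChoiceItem` layout, its field names, and the placeholder items are assumptions made for illustration; they are not taken from the dataset's actual files or from any submission's evaluation code.

```python
from dataclasses import dataclass

# Question counts per language, as quoted in the overview.
QUESTION_COUNTS = {
    "English": 12_723,
    "Simplified Chinese": 34_251,
    "Traditional Chinese": 14_123,
}


@dataclass(frozen=True)
class MultipleChoiceItem:
    """Hypothetical record layout; the real files may use other field names."""
    question: str
    options: dict[str, str]  # option label -> option text
    answer: str              # label of the correct option


def accuracy(items, predictions):
    """Return the fraction of items whose predicted label equals the gold label."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(pred == item.answer for item, pred in zip(items, predictions))
    return correct / len(items)


# Placeholder items for illustration only; they are NOT dataset content.
toy_items = [
    MultipleChoiceItem("Placeholder question 1", {"A": "option a", "B": "option b"}, "A"),
    MultipleChoiceItem("Placeholder question 2", {"A": "option a", "B": "option b"}, "B"),
]

total_questions = sum(QUESTION_COUNTS.values())
toy_accuracy = accuracy(toy_items, ["A", "A"])  # one of two predictions is correct
```

Accuracy is used here only because it is the usual headline metric for multiple-choice tasks; individual submissions may differ in prompting setup (for example few-shot or chain-of-thought), which this sketch does not model.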

Variants: MedQA

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

| Task | Model | Paper | Date |
|---|---|---|---|
| Question Answering | Shakti-LLM (2.5B) | SHAKTI: A 2.5 Billion Parameter … | 2024-10-15 |
| Question Answering | MedMobile (3.8B) | MedMobile: A mobile-sized language model … | 2024-10-11 |
| Question Answering | Med-Gemini | Capabilities of Gemini Models in … | 2024-04-29 |
| Question Answering | Meerkat-7B | Small Language Models Learn Enhanced … | 2024-03-30 |
| Question Answering | Meerkat-7B (Single) | Small Language Models Learn Enhanced … | 2024-03-30 |
| Question Answering | GPT-4 | Can Generalist Foundation Models Outcompete … | 2023-11-28 |
| Question Answering | LLAMA-2 (70B SC CoT) | MEDITRON-70B: Scaling Medical Pretraining for … | 2023-11-27 |
| Question Answering | LLAMA-2 (70B) | MEDITRON-70B: Scaling Medical Pretraining for … | 2023-11-27 |
| Question Answering | Meditron-70B (CoT + SC) | MEDITRON-70B: Scaling Medical Pretraining for … | 2023-11-27 |
| Question Answering | BioMedGPT-10B | BioMedGPT: Open Multimodal Generative Pre-trained … | 2023-08-18 |
| Question Answering | Med-PaLM 2 (CoT + SC) | Towards Expert-Level Medical Question Answering … | 2023-05-16 |
| Question Answering | Med-PaLM 2 | Towards Expert-Level Medical Question Answering … | 2023-05-16 |
| Question Answering | Med-PaLM 2 (5-shot) | Towards Expert-Level Medical Question Answering … | 2023-05-16 |
| Question Answering | GrapeQA: PEGA | GrapeQA: GRaph Augmentation and Pruning … | 2023-03-22 |
| Question Answering | PubMedGPT (2.7 B) | Large Language Models Encode Clinical … | 2022-12-26 |
| Question Answering | GPT-Neo (2.7 B) | Large Language Models Encode Clinical … | 2022-12-26 |
| Question Answering | Flan-PaLM (540 B) | Large Language Models Encode Clinical … | 2022-12-26 |
| Question Answering | BioLinkBERT (340 M) | Large Language Models Encode Clinical … | 2022-12-26 |
| Question Answering | OPT (few-shot, k=5) | Galactica: A Large Language Model … | 2022-11-16 |
| Question Answering | GAL 120B (zero-shot) | Galactica: A Large Language Model … | 2022-11-16 |

Research Papers

Recent papers with results on this dataset: