| Meditron-70B (CoT + SC) | MEDITRON-70B: Scaling Medical Pretraining for Lar… | 81.60 | 2023-11-27 |
| BioGPT-Large(1.5B) | BioGPT: Generative Pre-trained Transformer for Bi… | 81.00 | 2022-10-19 |
| RankRAG-llama3-70B (Zero-Shot) | RankRAG: Unifying Context Ranking with Retrieval-… | 79.80 | 2024-07-02 |
| Med-PaLM 2 (5-shot) | Towards Expert-Level Medical Question Answering w… | 79.20 | 2023-05-16 |
| Flan-PaLM (540B, Few-shot) | Large Language Models Encode Clinical Knowledge | 79.00 | 2022-12-26 |
| BioGPT(345M) | BioGPT: Generative Pre-trained Transformer for Bi… | 78.20 | 2022-10-19 |
| Codex 5-shot CoT | Can large language models reason about medical qu… | 78.20 | 2022-07-17 |
| Human Performance (single annotator) | PubMedQA: A Dataset for Biomedical Research Quest… | 78.00 | 2019-09-13 |
| MetaGen Blended RAG (zero-shot) | MetaGen Blended RAG: Higher Accuracy for Domain-S… | 77.90 | 2025-05-23 |
| GAL 120B (zero-shot) | Galactica: A Large Language Model for Science | 77.60 | 2022-11-16 |
| Flan-PaLM (62B, Few-shot) | Large Language Models Encode Clinical Knowledge | 77.20 | 2022-12-26 |
| MediSwift-XL | MediSwift: Efficient Sparse Pre-trained Biomedica… | 76.80 | 2024-03-01 |
| BioMedGPT-10B | BioMedGPT: Open Multimodal Generative Pre-trained… | 76.10 | 2023-08-18 |
| Flan-PaLM (540B, SC) | Large Language Models Encode Clinical Knowledge | 75.20 | 2022-12-26 |
| Med-PaLM 2 (ER) | Towards Expert-Level Medical Question Answering w… | 75.00 | 2023-05-16 |
| Med-PaLM 2 (CoT + SC) | Towards Expert-Level Medical Question Answering w… | 74.00 | 2023-05-16 |
| BLOOM (zero-shot) | Galactica: A Large Language Model for Science | 73.60 | 2022-11-16 |
| CoT-T5-11B (1024 Shot) | The CoT Collection: Improving Zero-shot and Few-s… | 73.42 | 2023-05-23 |
| BioLinkBERT (large) | LinkBERT: Pretraining Language Models with Docume… | 72.20 | 2022-03-29 |
| BioLinkBERT (base) | LinkBERT: Pretraining Language Models with Docume… | 70.20 | 2022-03-29 |
| OPT (zero-shot) | Galactica: A Large Language Model for Science | 70.20 | 2022-11-16 |
| Flan-PaLM (8B, Few-shot) | Large Language Models Encode Clinical Knowledge | 67.60 | 2022-12-26 |
| PaLM (62B, Few-shot) | Large Language Models Encode Clinical Knowledge | 57.80 | 2022-12-26 |
| PubMedBERT uncased | Domain-Specific Language Model Pretraining for Bi… | 55.84 | 2020-07-31 |
| PaLM (540B, Few-shot) | Large Language Models Encode Clinical Knowledge | 55.00 | 2022-12-26 |
| PaLM (8B, Few-shot) | Large Language Models Encode Clinical Knowledge | 34.00 | 2022-12-26 |