DeepSeek-R1 (MGDebugger) | From Code to Correctness: Closing the Last Mile o… | 100.00 | 2024-10-02
LLaMA 3 | Debug like a Human: A Large Language Model Debugg… | 99.40 | 2024-02-25
QualityFlow (Sonnet-3.5) | QualityFlow: An Agentic Workflow for Program Synt… | 98.80 | 2025-01-20
Phi-2 | Planning-Driven Programming: A Large Language Mod… | 98.20 | 2024-11-21
EG-CFG (DeepSeek-V3-0324) | Execution Guided Line-by-Line Code Generation | 96.95 | 2025-06-12
Mistral 7B | MapCoder: Multi-Agent Code Generation for Competi… | 93.90 | 2024-05-18
L2MAC (GPT-4) | L2MAC: Large Language Model Automatic Computer fo… | 90.20 | 2023-10-02