APPS

Code Generation Benchmark

Showing 18 results | Metric: Introductory Pass@1

Top Performing Models

| Rank | Model | Paper | Introductory Pass@1 | Date | Code |
|---|---|---|---|---|---|
| 1 | LPW (GPT-4o) | Planning-Driven Programming: A Large Language Model Programming Workflow | 87.20 | 2024-11-21 | you68681/lpw |
| 2 | MoTCoder-32B-V1.5 | MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks | 68.44 | 2023-12-26 | dvlab-research/motcoder |
| 3 | MoTCoder-7B-V1.5 | MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks | 54.26 | 2023-12-26 | dvlab-research/motcoder |
| 4 | deepseek-ai/deepseek-coder-6.7b-instruct | DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | 33.80 | 2024-01-25 | deepseek-ai/DeepSeek-Coder |
| 5 | code-davinci-002 175B | CodeT: Code Generation with Generated Tests | 31.92 | 2022-07-21 | microsoft/codet |
| 6 | WizardCoder-15b | CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules | 26.29 | 2023-10-13 | SalesforceAIResearch/CodeChain |
| 7 | CodeSim (GPT4) | CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging | 26.04 | 2025-02-08 | kagnlp/CodeGenerator |
| 8 | AlphaCode 1B Filtered from 50000 | Competition-Level Code Generation with AlphaCode | 22.00 | 2022-02-08 | google-deepmind/code_contests, deepmind/code_contests |
| 9 | CodeRL+CodeT5 | CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning | 20.00 | 2022-07-05 | salesforce/codet5, salesforce/coderl |
| 10 | code-davinci-002 175B (CodeT) | CodeT: Code Generation with Generated Tests | 0.00 | 2022-07-21 | microsoft/codet |
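
The page does not say how each paper computes Introductory Pass@1, and evaluation protocols may differ between entries. As a rough illustration only, the sketch below uses the unbiased pass@k estimator commonly used for code generation benchmarks (introduced with the Codex evaluation, Chen et al., 2021), averaged over problems and reported as a percentage. The function names and the per-problem sample counts are hypothetical and are not taken from any listed paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a passing one.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k=1):
    """Mean pass@k over problems, as a percentage.

    results: iterable of (n, c) pairs, one per problem.
    """
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)


# Hypothetical counts for four problems (illustrative, not real APPS data).
results = [(10, 10), (10, 5), (10, 0), (10, 5)]
print(f"{benchmark_pass_at_k(results, k=1):.2f}")  # prints 50.00
```

For k = 1 the estimator reduces to c / n, the fraction of samples that pass, so a score such as 87.20 would correspond to that fraction averaged over the introductory problems under this assumed protocol.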

All Papers (18)