CodeContests

Dataset Information
Modalities: Texts
Languages: English
Introduced: 2022

Overview

CodeContests is a competitive programming dataset for machine learning. It was used to train AlphaCode.

It consists of programming problems drawn from a variety of sources.

Problems include test cases in the form of paired inputs and outputs, as well as both correct and incorrect human solutions in a variety of languages.
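
As a rough illustration of how paired input/output test cases can separate correct from incorrect solutions, here is a minimal Python sketch. The `IOExample` and `Problem` classes, the `passes_all_tests` helper, and the toy problem are all hypothetical and invented for this example; they do not reflect the dataset's actual schema or any official evaluation harness.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class IOExample:
    """One test case: the text fed to a program and the text it should print."""
    input: str
    output: str


@dataclass
class Problem:
    """Hypothetical in-memory view of a problem (not the real dataset schema)."""
    description: str
    tests: List[IOExample] = field(default_factory=list)


def passes_all_tests(solve: Callable[[str], str], problem: Problem) -> bool:
    """Return True if `solve` reproduces the expected output on every test.

    Outputs are compared token by token so that trailing whitespace
    differences do not cause a failure.
    """
    return all(
        solve(case.input).split() == case.output.split()
        for case in problem.tests
    )


# Toy problem, invented for illustration only.
problem = Problem(
    description="Read two integers and print their sum.",
    tests=[IOExample("1 2\n", "3\n"), IOExample("10 -4\n", "6\n")],
)


def correct_solution(stdin: str) -> str:
    a, b = map(int, stdin.split())
    return f"{a + b}\n"


def incorrect_solution(stdin: str) -> str:
    # Deliberate bug: subtracts instead of adding.
    a, b = map(int, stdin.split())
    return f"{a - b}\n"


if __name__ == "__main__":
    print(passes_all_tests(correct_solution, problem))
    print(passes_all_tests(incorrect_solution, problem))
```

Modelling each solution as a function from input text to output text keeps the sketch self-contained; a real harness would more likely run submitted programs in a sandboxed process with time and memory limits.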

Variants: CodeContests

Associated Benchmarks

This dataset is used in 1 benchmark:

  • Code Generation

Recent Benchmark Submissions

Task | Model | Paper | Date
Code Generation | EG-CFG (DeepSeek-V3-0324) | Execution Guided Line-by-Line Code Generation | 2025-06-12
Code Generation | CodeSim (GPT4) | CODESIM: Multi-Agent Code Generation and … | 2025-02-08
Code Generation | LPW (GPT-4o) | Planning-Driven Programming: A Large Language … | 2024-11-21
Code Generation | MapCoder (GPT-4) | MapCoder: Multi-Agent Code Generation for … | 2024-05-18
Code Generation | MoTCoder-15B | MoTCoder: Elevating Large Language Models … | 2023-12-26
Code Generation | MoTCoder-7B-v1.5 | MoTCoder: Elevating Large Language Models … | 2023-12-26
Code Generation | CodeChain + WizardCoder-15B | CodeChain: Towards Modular Code Generation … | 2023-10-13
Code Generation | WizardCoder-15B | WizardCoder: Empowering Code Large Language … | 2023-06-14

Research Papers

Recent papers with results on this dataset: