CodeContests

Name: CodeContests
Published: 2022-02-02
License: CC BY 4.0

Dataset Information

Modalities: Texts
Languages: English
Introduced: 2022
License: CC BY 4.0

Overview

CodeContests is a competitive programming dataset for machine learning. It was used to train AlphaCode.

It consists of programming problems drawn from a variety of sources.

Each problem includes test cases in the form of paired inputs and outputs, along with both correct and incorrect human solutions in a variety of languages.
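
To make the role of the paired inputs and outputs concrete, here is a minimal sketch of how a candidate solution might be checked against a problem's test cases. The toy problem, the `TESTS` list, and the helper names `run_solution` and `passes_all_tests` are hypothetical and do not reflect the dataset's actual schema or the AlphaCode evaluation harness. The sketch assumes solutions are Python programs that read from standard input and write to standard output, and that outputs are compared token by token, ignoring whitespace differences.

```python
import subprocess
import sys


def run_solution(source: str, stdin_text: str, timeout_s: float = 5.0):
    """Run a Python program in a subprocess; return its stdout, or None on a crash."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    if result.returncode != 0:
        return None
    return result.stdout


def passes_all_tests(source: str, tests) -> bool:
    """Return True only if the program reproduces every expected output."""
    for stdin_text, expected in tests:
        try:
            actual = run_solution(source, stdin_text)
        except subprocess.TimeoutExpired:
            return False
        # Assumption: whitespace-insensitive, token-by-token comparison.
        if actual is None or actual.split() != expected.split():
            return False
    return True


# Hypothetical toy problem: print the sum of two integers.
TESTS = [("1 2\n", "3\n"), ("10 -4\n", "6\n")]
CORRECT = "a, b = map(int, input().split())\nprint(a + b)"
INCORRECT = "a, b = map(int, input().split())\nprint(a - b)"

if __name__ == "__main__":
    print("correct solution passes:", passes_all_tests(CORRECT, TESTS))
    print("incorrect solution passes:", passes_all_tests(INCORRECT, TESTS))
```

The pairing of correct and incorrect solutions is what makes such a check informative: a useful test set accepts the former and rejects the latter.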

Variants: CodeContests

Associated Benchmarks

This dataset is used in 1 benchmark:

Code Generation - Metrics: Test Set pass@1, Test Set pass@5, Val Set pass@1, Val Set pass@5
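
This page does not define the pass@k metrics it lists. As a point of reference, the sketch below implements the unbiased pass@k estimator introduced with Codex (Chen et al., 2021): given n sampled programs for a problem, of which c pass all tests, pass@k = 1 - C(n - c, k) / C(n, k). Whether each leaderboard submission computes its numbers this way is an assumption, and the function names here are illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the probability that at least one of k samples is correct.

    n: samples generated for the problem
    c: samples that passed all tests
    k: sampling budget being evaluated (1 <= k <= n)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples are incorrect,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts for three problems, 10 samples each.
    results = [(10, 3), (10, 0), (10, 10)]
    print("pass@1:", mean_pass_at_k(results, 1))
    print("pass@5:", mean_pass_at_k(results, 5))
```

With k = 1 the estimator reduces to the fraction of correct samples, c / n, which is why pass@1 is often read as a single-attempt success rate.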

Recent Benchmark Submissions

Task	Model	Paper	Date
Code Generation	EG-CFG (DeepSeek-V3-0324)	Execution Guided Line-by-Line Code Generation	2025-06-12
Code Generation	CodeSim (GPT4)	CODESIM: Multi-Agent Code Generation and …	2025-02-08
Code Generation	LPW (GPT-4o)	Planning-Driven Programming: A Large Language …	2024-11-21
Code Generation	MapCoder (GPT-4)	MapCoder: Multi-Agent Code Generation for …	2024-05-18
Code Generation	MoTCoder-15B	MoTCoder: Elevating Large Language Models …	2023-12-26
Code Generation	MoTCoder-7B-v1.5	MoTCoder: Elevating Large Language Models …	2023-12-26
Code Generation	CodeChain + WizardCoder-15B	CodeChain: Towards Modular Code Generation …	2023-10-13
Code Generation	WizardCoder-15B	WizardCoder: Empowering Code Large Language …	2023-06-14

Research Papers

Recent papers with results on this dataset:
