SAFIM

Syntax-Aware Fill-in-the-Middle

Dataset Information

Modalities: Texts
Languages: English
Introduced: 2024

Overview

Syntax-Aware Fill-in-the-Middle (SAFIM) is a benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task, in which a model generates a missing span of code given the code before and after it. SAFIM comprises three subtasks: Algorithmic Block Completion, Control-Flow Expression Completion, and API Function Call Completion. The benchmark is built from code submitted between April 2022 and January 2023, a window chosen to minimize the impact of data contamination on evaluation results.
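To make the FIM setup concrete, the following minimal sketch splits a snippet into prefix, middle, and suffix by masking a control-flow condition, loosely in the spirit of the Control-Flow Expression Completion subtask. The sentinel strings, the class and function names, and the sample snippet are all hypothetical illustrations; they are not SAFIM's actual data format or any model's real FIM tokens.

```python
from dataclasses import dataclass

# Hypothetical sentinel strings; real models define their own FIM tokens.
PREFIX_TAG = "<PREFIX>"
SUFFIX_TAG = "<SUFFIX>"
MIDDLE_TAG = "<MIDDLE>"


@dataclass
class FimExample:
    prefix: str  # code before the masked span
    middle: str  # the span the model is asked to generate
    suffix: str  # code after the masked span


def make_fim_example(code: str, start: int, end: int) -> FimExample:
    """Split `code` at character offsets [start, end) into three parts."""
    if not 0 <= start <= end <= len(code):
        raise ValueError("span must lie inside the code")
    return FimExample(code[:start], code[start:end], code[end:])


def build_prompt(example: FimExample) -> str:
    """Prefix-suffix-middle layout: the model continues after MIDDLE_TAG."""
    return f"{PREFIX_TAG}{example.prefix}{SUFFIX_TAG}{example.suffix}{MIDDLE_TAG}"


def reassemble(example: FimExample, completion: str) -> str:
    """Insert a completion back between the prefix and the suffix."""
    return example.prefix + completion + example.suffix


# Sample snippet (invented for illustration): mask the `if` condition.
code = (
    "def is_even(n):\n"
    "    if n % 2 == 0:\n"
    "        return True\n"
    "    return False\n"
)
condition = "n % 2 == 0"
start = code.index(condition)
example = make_fim_example(code, start, start + len(condition))
prompt = build_prompt(example)
```

Feeding `prompt` to a FIM-capable model and passing its output to `reassemble` yields a full program that can then be checked; how SAFIM itself scores completions is not described on this page, so no scoring step is sketched here.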

The SAFIM benchmark is partially derived from problem descriptions and code solutions from https://codeforces.com. Under the Codeforces license, the texts of Codeforces problems may be published in any open source, provided that a direct link to the site is preserved.

Variants: SAFIM

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task             Model                     Paper                                 Date
Code Completion  deepseek-coder-33b-base   Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  deepseek-coder-6.7b-base  Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  starcoderbase             Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  gpt-4-1106-preview        Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  CodeLlama-13b-hf          Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  deepseek-coder-1.3b-base  Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  CodeLlama-34b-hf          Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  CodeLlama-7b-hf           Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  gpt-3.5-turbo-0301        Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  incoder-6B                Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  codegen-16B-multi         Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  codegen-2B-multi          Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  incoder-1B                Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  codegen-6B-multi          Evaluation of LLMs on Syntax-Aware …  2024-03-07
Code Completion  codegen-350M-multi        Evaluation of LLMs on Syntax-Aware …  2024-03-07

Research Papers

Recent papers with results on this dataset: