RoFT-chatgpt

Dataset Information

Languages

English

Introduced

2023

License

MIT

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

RoFT-chatgpt is a variation of the RoFT dataset in which the same human-written prompts are continued by the gpt-3.5-turbo model. Each sample consists of ten sentences: the first part is written by a human and the remainder is generated by the LLM. Every sample therefore carries a boundary label, the index of the sentence at which authorship changes.
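
To make the sample layout concrete, here is a minimal Python sketch of one record. The class name `BoundarySample`, its field names, and the convention that the boundary is the 0-based index of the first LLM-written sentence are all assumptions made for illustration; the released files may use a different schema or indexing convention.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoundarySample:
    """Hypothetical container for one sample (not the dataset's official schema)."""

    sentences: List[str]  # the ten sentences of the text
    boundary: int         # ASSUMED: 0-based index of the first LLM-written sentence

    def __post_init__(self) -> None:
        # The dataset description states that every sample has ten sentences.
        if len(self.sentences) != 10:
            raise ValueError("expected exactly ten sentences")
        if not 0 <= self.boundary <= len(self.sentences):
            raise ValueError("boundary index out of range")

    def human_part(self) -> List[str]:
        """Sentences before the boundary (human-written under the assumed convention)."""
        return self.sentences[: self.boundary]

    def generated_part(self) -> List[str]:
        """Sentences from the boundary onward (LLM-written under the assumed convention)."""
        return self.sentences[self.boundary:]


# Placeholder text only; these are not real dataset sentences.
sample = BoundarySample(sentences=[f"s{i}" for i in range(10)], boundary=4)
```

Under this convention, `sample.human_part()` returns the first four sentences and `sample.generated_part()` returns the remaining six.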

Variants: RoFT-chatgpt

Associated Benchmarks

This dataset is used in 1 benchmark:

Boundary Detection - Metrics: Accuracy (%), MSE
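
The two metrics can be sketched as follows, assuming that accuracy is the percentage of samples whose predicted boundary index exactly matches the true one and that MSE is the mean squared difference between predicted and true indices. The function names are hypothetical, and the benchmark's actual evaluation protocol may differ in detail.

```python
from typing import Sequence


def boundary_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Percentage of samples whose predicted boundary equals the true boundary."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


def boundary_mse(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean squared error between predicted and true boundary indices."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels for illustration only; these are not results from the dataset.
y_true = [3, 5, 7, 2]
y_pred = [3, 4, 7, 5]
acc = boundary_accuracy(y_true, y_pred)  # two of four exact matches
mse = boundary_mse(y_true, y_pred)       # squared errors 0, 1, 0, 9
```

Reporting both is useful because accuracy rewards only exact hits, while MSE also reflects how far a wrong prediction lands from the true boundary.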

Recent Benchmark Submissions

Task	Model	Paper	Date
Boundary Detection	GigaCheck (DN-DAB-DETR)	GigaCheck: Detecting LLM-generated Content	2024-10-31
Boundary Detection	RoBERTa + SEP	AI-generated text boundary detection with …	2023-11-14
Boundary Detection	TLE + TS Binary	AI-generated text boundary detection with …	2023-11-14
Boundary Detection	PHD + TS ML	AI-generated text boundary detection with …	2023-11-14

Research Papers

Recent papers with results on this dataset:

GigaCheck: Detecting LLM-generated Content (2024)
AI-generated text boundary detection with RoFT (2023)

External Links:

RoFT-chatgpt