TweepFake

Dataset Information

Modalities: Texts
Languages: English
Introduced: 2020
License: MIT
Homepage: Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

The TweepFake dataset consists of 25,572 tweets posted on Twitter by either bots or humans. The bots imitated human accounts and were built on various generative techniques, including Markov Chains, RNN, RNN+Markov, LSTM, and GPT-2.

Variants: TweepFake

Associated Benchmarks

This dataset is used in 1 benchmark:

Binary text classification - Metrics: F1 score, Accuracy (%)
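
As a rough illustration of the two metrics listed above, the following sketch computes accuracy (as a percentage) and F1 for a binary bot-vs-human classifier. The function name `binary_metrics`, the label strings `"bot"` and `"human"`, and the choice of `"bot"` as the positive class are assumptions made for this example; they are not taken from the benchmark's official evaluation code.

```python
def binary_metrics(y_true, y_pred, positive="bot"):
    """Return accuracy (%) and F1 for binary labels.

    `positive` names the class treated as the positive one when
    computing precision and recall (an assumption for this sketch).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = 100.0 * correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "f1": f1}


# Toy labels, made up purely for illustration.
y_true = ["bot", "bot", "human", "human", "bot", "human"]
y_pred = ["bot", "human", "human", "bot", "bot", "human"]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

On the toy labels, four of six predictions are correct, and precision and recall for the `"bot"` class are both 2/3, so F1 is also 2/3.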

Recent Benchmark Submissions

Task	Model	Paper	Date
Binary text classification	GigaCheck (Mistral-7B)	GigaCheck: Detecting LLM-generated Content	2024-10-31
Binary text classification	XLNet	TweepFake: about Detecting Deepfake Tweets	2020-07-31

Research Papers

Recent papers with results on this dataset:

GigaCheck: Detecting LLM-generated Content (2024)
TweepFake: about Detecting Deepfake Tweets (2020)

External Links:

TweepFake
