TweepFake

Dataset Information
Modalities: Texts
Languages: English
Introduced: 2020
License: MIT

Overview

The TweepFake dataset consists of 25,572 Twitter messages (tweets), each posted by either a bot or a human. Each bot imitated a human account and generated its text with a technique such as Markov chains, RNN, RNN+Markov, LSTM, or GPT-2.
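To make the simplest of the listed techniques concrete, here is a minimal sketch of a word-level Markov chain text generator. It is only an illustration: the function names `build_chain` and `generate` and the three-message corpus are made up for this example, and nothing here is claimed to match how the bots in TweepFake were actually implemented.

```python
import random
from collections import defaultdict


def build_chain(texts, order=1):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for i in range(len(words) - order):
            state = tuple(words[i:i + order])
            chain[state].append(words[i + order])
    return chain


def generate(chain, start, max_words=20, seed=None):
    """Random-walk the chain from `start` until a dead end or the length cap."""
    rng = random.Random(seed)
    order = len(start)
    out = list(start)
    while len(out) < max_words:
        # .get() avoids creating empty entries in the defaultdict.
        followers = chain.get(tuple(out[-order:]))
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)


# Made-up example messages, NOT taken from TweepFake.
corpus = [
    "the weather is nice today",
    "the weather is awful today",
    "the game is on tonight",
]
chain = build_chain(corpus)
sample = generate(chain, start=("the",), max_words=8, seed=0)
print(sample)
```

Because each next word depends only on the previous `order` words, the output can recombine fragments of different source messages, which is one reason such text may read as locally fluent but globally incoherent.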

Variants: TweepFake

Associated Benchmarks

This dataset is used in 1 benchmark.

Recent Benchmark Submissions

Task | Model | Paper | Date
Binary text classification | GigaCheck (Mistral-7B) | GigaCheck: Detecting LLM-generated Content | 2024-10-31
Binary text classification | XLNet | TweepFake: about Detecting Deepfake Tweets | 2020-07-31
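Submissions to a binary text classification task like this one are typically compared on standard classification metrics. The sketch below shows how accuracy, precision, recall, and F1 could be computed for a human-vs-bot detector. The label strings `"bot"` and `"human"`, the function name `binary_scores`, and the toy predictions are assumptions for illustration; the listing above does not state which metric the benchmark reports or how labels are encoded.

```python
def binary_scores(y_true, y_pred, positive="bot"):
    """Compute accuracy, precision, recall, and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical gold labels and detector outputs, not real benchmark data.
y_true = ["bot", "human", "bot", "human", "bot", "human"]
y_pred = ["bot", "human", "human", "human", "bot", "bot"]
scores = binary_scores(y_true, y_pred)
print(scores)
```

Treating "bot" as the positive class is a design choice; swapping `positive` to `"human"` would report precision and recall from the other side.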
