HateMM

Dataset Information
Modalities: Videos, Audio
Languages: English
Introduced: 2023

Overview

Hate speech has become one of the most significant issues in modern society, with implications in both the online and offline worlds. However, most prior work on detecting it has focused on text, with relatively little work on images and even less on videos. Early-stage automated video moderation techniques are therefore needed to screen uploaded videos and keep platforms safe and healthy. To this end, we curated approximately 43 hours of videos from BitChute and manually annotated each one as hate or non-hate, along with the frame spans that could explain the labeling decision.

This upload contains the labeled video dataset crawled from BitChute, as described in the paper 'HateMM: A Multi-modal Dataset for Hate Video Classification'.
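
As a rough illustration of the annotation scheme described above (a video-level hate/non-hate label plus the frame spans that explain it), the sketch below shows one way such a record might be represented and summarized. The class name, field names, span convention (inclusive start and end frame indices), and example values are all assumptions made for illustration; they are not the dataset's actual file format or schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical record layout; the real HateMM annotation files may differ.
@dataclass
class VideoAnnotation:
    video_id: str                      # assumed identifier, not a real file name
    label: str                         # "hate" or "non-hate"
    spans: List[Tuple[int, int]] = field(default_factory=list)  # inclusive (start, end) frames


def merge_spans(spans: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping or directly adjacent frame spans."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous span instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def flagged_frame_count(ann: VideoAnnotation) -> int:
    """Count distinct frames covered by the explanatory spans."""
    return sum(end - start + 1 for start, end in merge_spans(ann.spans))


# Made-up example values, for illustration only.
example = VideoAnnotation("video_001", "hate", [(10, 20), (15, 30), (50, 55)])
print(merge_spans(example.spans))
print(flagged_frame_count(example))
```

Merging spans before counting avoids double-counting frames when two annotated spans overlap, which matters if the spans are later used to measure how much of a video was marked as explanatory.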

Variants: HateMM

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                   Model              Paper                               Date
Hate Speech Detection  HXP + CLAP + CLIP  Towards a Robust Framework for …    2025-02-11
Hate Speech Detection  BERT + ViT + MFCC  HateMM: A Multi-Modal Dataset for … 2023-05-06

Research Papers

Recent papers with results on this dataset: