HateMM

Name: HateMM
Published: 2023-04-04
License: Creative Commons Attribution 4.0 International

Dataset Information

Modalities: Videos, Audio
Languages: English
Introduced: 2023
License: Creative Commons Attribution 4.0 International
Homepage: Official Website

Overview

Hate speech has become one of the most significant issues in modern society, with implications in both the online and offline worlds. However, most prior work has focused on text, with relatively little on images and even less on videos. Thus, early-stage automated video moderation techniques are needed to handle the videos being uploaded and keep platforms safe and healthy. To this end, we curated approximately 43 hours of videos from BitChute and manually annotated them as hate or non-hate, along with the frame spans that could explain the labeling decision.

This upload contains the labeled video dataset crawled from BitChute, as described in the paper 'HateMM: A Multi-modal Dataset for Hate Video Classification'.
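As a rough illustration of the annotation scheme described above (a video-level hate or non-hate label plus frame spans that explain the decision), one record could be modeled as in the sketch below. The class name `VideoAnnotation`, its field names, and the inclusive frame-index convention are assumptions made for this example; they are not the dataset's actual file schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Label names are taken from the overview text; the exact strings stored
# in the released files are an assumption.
VALID_LABELS = ("hate", "non-hate")


@dataclass
class VideoAnnotation:
    """Hypothetical in-memory record for one annotated video."""

    video_id: str  # hypothetical identifier, not a real dataset key
    label: str     # "hate" or "non-hate"
    # Assumed convention: (start_frame, end_frame), both inclusive.
    rationale_spans: List[Tuple[int, int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.label not in VALID_LABELS:
            raise ValueError(f"unknown label: {self.label!r}")
        for start, end in self.rationale_spans:
            if start < 0 or end < start:
                raise ValueError(f"invalid span: ({start}, {end})")

    def annotated_frames(self) -> int:
        """Count frames covered by the spans, assuming they do not overlap."""
        return sum(end - start + 1 for start, end in self.rationale_spans)
```

A record built as `VideoAnnotation("v1", "hate", [(0, 9), (20, 24)])` would then report the number of frames marked as explaining the label through `annotated_frames()`.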

Variants: HateMM

Associated Benchmarks

This dataset is used in 1 benchmark:

Hate Speech Detection - Metrics: TEST F1 (macro)
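For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so the hate and non-hate classes count equally regardless of how many videos each contains. The sketch below is a minimal pure-Python version of that definition; the function names and the label strings are assumptions for illustration, and it is not the benchmark's official evaluation script.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], cls: str) -> float:
    """F1 for one class, treating `cls` as the positive label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Convention assumed here: a class with no true or predicted
    # instances scores 0.0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    classes: Sequence[str] = ("hate", "non-hate"),
) -> float:
    """Unweighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


if __name__ == "__main__":
    truth = ["hate", "hate", "non-hate", "non-hate"]
    preds = ["hate", "non-hate", "non-hate", "non-hate"]
    print(f"macro F1 = {macro_f1(truth, preds):.4f}")
```

In the toy example, the hate class has F1 of 2/3 and the non-hate class has F1 of 4/5, and the macro score is their plain average.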

Recent Benchmark Submissions

Task	Model	Paper	Date
Hate Speech Detection	HXP + CLAP + CLIP	Towards a Robust Framework for …	2025-02-11
Hate Speech Detection	BERT + ViT + MFCC	HateMM: A Multi-Modal Dataset for …	2023-05-06

Research Papers

Recent papers with results on this dataset:

External Links:

HateMM
