SHAJ

Spoken Hate in the Albanian Jargon

Dataset Information
Modalities: Texts
Languages: Albanian
Introduced: 2021
License: CC-BY 4.0

Overview

SHAJ is a dataset for detecting abusive and offensive language in Albanian. It consists of user comments collected from Instagram and YouTube, formatted following the OffensEval convention.
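As a rough illustration of what an OffensEval-style layout looks like in practice, the sketch below parses a small tab-separated sample and counts the top-level labels. It assumes the layout used by the OffensEval/OLID data (one comment per row, one label column per subtask, with OFF/NOT as the subtask A labels). The column names `id`, `text`, `subtask_a`, `subtask_b`, `subtask_c` and the sample rows are placeholders for illustration, not taken from SHAJ, whose actual file and column names may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in an OffensEval/OLID-style layout. The column names
# and rows are placeholders for illustration, not content from SHAJ.
SAMPLE_TSV = (
    "id\ttext\tsubtask_a\tsubtask_b\tsubtask_c\n"
    "1\tplaceholder comment one\tNOT\tNULL\tNULL\n"
    "2\tplaceholder comment two\tOFF\tTIN\tIND\n"
    "3\tplaceholder comment three\tOFF\tUNT\tNULL\n"
)


def read_offenseval_tsv(handle):
    """Parse tab-separated rows into dicts keyed by the header names."""
    # QUOTE_NONE keeps quote characters inside comments as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def label_counts(rows, column="subtask_a"):
    """Count how often each label occurs in the given label column."""
    return Counter(row[column] for row in rows)


rows = read_offenseval_tsv(io.StringIO(SAMPLE_TSV))
counts = label_counts(rows)
print(counts)
```

To read a real file instead of the inline sample, pass an open file handle (for example `open(path, encoding="utf-8", newline="")`) to `read_offenseval_tsv`, and adjust `column` to whatever the distributed files actually call the label field.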

Variants: SHAJ

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                   Model                   Paper                       Date
Hate Speech Detection  Baseline BERT (task A)  Detecting Abusive Albanian  2021-07-28

Research Papers

Recent papers with results on this dataset: