SHAJ

Name: SHAJ
Published: 2021-07-28
License: CC-BY 4.0

Spoken Hate in the Albanian Jargon

Dataset Information

Modalities

Texts

Languages

Albanian

Introduced

2021

License

CC-BY 4.0

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

SHAJ is a dataset for detecting abusive and offensive language in Albanian. It consists of user comments collected from Instagram and YouTube, and the data is formatted following the OffensEval convention.
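
As a rough illustration of working with data in an OffensEval-style layout, the sketch below reads a tab-separated file with a header row and counts the labels. The column names (`id`, `text`, `subtask_a`), the label values (`OFF`, `NOT`), and the sample rows are assumptions modelled on the general OffensEval layout. They are not taken from the SHAJ files and should be checked against the actual release.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in an OffensEval-style layout. Column names, labels,
# and rows are placeholders for illustration, not data from SHAJ.
SAMPLE_TSV = (
    "id\ttext\tsubtask_a\n"
    "1\tplaceholder comment one\tNOT\n"
    "2\tplaceholder comment two\tOFF\n"
    "3\tplaceholder comment three\tNOT\n"
)


def read_offenseval_tsv(handle, text_column="text", label_column="subtask_a"):
    """Return (text, label) pairs from a tab-separated file with a header row."""
    # QUOTE_NONE keeps quotation marks inside comments as literal characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [(row[text_column], row[label_column]) for row in reader]


examples = read_offenseval_tsv(io.StringIO(SAMPLE_TSV))
label_counts = Counter(label for _, label in examples)
print(label_counts)
```

To read a real file, the same function could be called on `open(path, encoding="utf-8", newline="")`, with the column arguments adjusted to whatever header the file actually uses.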

Variants: SHAJ

Associated Benchmarks

This dataset is used in 1 benchmark:

Hate Speech Detection - Metrics: F1
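
The page does not state which averaging the F1 metric uses. As a minimal sketch, the code below computes per-class F1 from gold and predicted labels and then their unweighted mean (macro F1), which is one common choice for this kind of task. The label names and the toy predictions are hypothetical.

```python
def f1_for_class(gold, pred, positive):
    """F1 score treating `positive` as the target class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        # No true positives means precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    return sum(f1_for_class(gold, pred, label) for label in labels) / len(labels)


# Hypothetical toy labels, not results from any model on SHAJ.
gold = ["OFF", "NOT", "NOT", "OFF"]
pred = ["OFF", "NOT", "OFF", "OFF"]
score = macro_f1(gold, pred)
print(round(score, 4))
```

Macro averaging weights each class equally, so performance on a rare class affects the score as much as performance on a frequent one.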

Recent Benchmark Submissions

Task	Model	Paper	Date
Hate Speech Detection	Baseline BERT (task A)	Detecting Abusive Albanian	2021-07-28

Research Papers

Recent papers with results on this dataset:

Detecting Abusive Albanian (2021)

External Links:

SHAJ
