GERMS-AT: A Sexism/Misogyny Dataset of Forum Comments from an Austrian Online Newspaper
This dataset contains 7984 user comments from an Austrian online newspaper. Each comment has been annotated by 4 or more of 11 annotators for how strongly sexism/misogyny is present in it. The dataset was used in the GermEval 2024 Shared Task 1: GerMS-Detect to evaluate data-driven approaches to automatically detecting sexism in user comments.
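Because each comment carries several annotator judgments rather than a single label, a common first step is to summarise them per comment. The following is a minimal sketch of one way to do that, not the dataset authors' or the shared task's method. It assumes the labels are integers where 0 means no sexism/misogyny and larger values mean a stronger presence; the function name `aggregate_labels` and the keys of the returned dictionary are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean


def aggregate_labels(labels):
    """Summarise the annotator labels of a single comment.

    Assumption: `labels` is a list of integers, one per annotator,
    where 0 means "no sexism/misogyny" and higher values mean stronger.
    """
    if not labels:
        raise ValueError("need at least one annotator label")

    counts = Counter(labels)
    top_count = max(counts.values())
    # Ties are resolved toward the smaller label (an arbitrary choice).
    majority_label = min(l for l, c in counts.items() if c == top_count)

    n_positive = sum(1 for l in labels if l > 0)
    return {
        "n_annotators": len(labels),
        # True if at least one annotator saw any sexism/misogyny.
        "any_positive": n_positive > 0,
        # True if strictly more than half of the annotators did.
        "majority_positive": 2 * n_positive > len(labels),
        "majority_label": majority_label,
        "mean_strength": mean(labels),
    }


if __name__ == "__main__":
    # Made-up labels from five hypothetical annotators.
    print(aggregate_labels([2, 2, 4, 0, 1]))
```

Keeping several summaries side by side (any, majority, mean) makes annotator disagreement visible instead of hiding it behind one aggregated value.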
The dataset was introduced and described in: Brigitte Krenn, Johann Petrak, Marina Kubina, and Christian Burger. 2024. GERMS-AT: A Sexism/Misogyny Dataset of Forum Comments from an Austrian Online Newspaper. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 7728–7739.
The data can be downloaded here: https://huggingface.co/datasets/ofai/GerMS-AT
Variants: GerMS-AT
This dataset is used in 2 benchmarks:
Task | Model | Paper | Date |
---|---|---|---|
GermEval2024 Shared Task 1 Subtask 1 | mE5-large-SVM | Detecting Sexism in German Online … | 2024-09-16 |
GermEval2024 Shared Task 1 Subtask 2 | GBERT-large-SVM | Detecting Sexism in German Online … | 2024-09-16 |
Recent papers with results on this dataset: