ML Research Wiki / Benchmarks / Bias Detection / StereoSet

StereoSet

Bias Detection Benchmark

Performance Over Time

📊 Showing 11 results | 📏 Metric: ICAT Score

Top Performing Models

Rank Model Paper ICAT Score Date Code
1 GPT-2 (small) StereoSet: Measuring stereotypical bias in pretrained language models 72.97 2020-04-20 📦 moinnadeem/StereoSet 📦 kanekomasahiro/evaluate_bias_in_mlm 📦 zalkikar/mlm-bias
2 XLNet (large) StereoSet: Measuring stereotypical bias in pretrained language models 72.03 2020-04-20 📦 moinnadeem/StereoSet 📦 kanekomasahiro/evaluate_bias_in_mlm 📦 zalkikar/mlm-bias
3 GPT-2 (medium) StereoSet: Measuring stereotypical bias in pretrained language models 71.73 2020-04-20 📦 moinnadeem/StereoSet 📦 kanekomasahiro/evaluate_bias_in_mlm 📦 zalkikar/mlm-bias
4 BERT (base) StereoSet: Measuring stereotypical bias in pretrained language models 71.21 2020-04-20 📦 moinnadeem/StereoSet 📦 kanekomasahiro/evaluate_bias_in_mlm 📦 zalkikar/mlm-bias
5 GPT-2 (large) StereoSet: Measuring stereotypical bias in pretrained language models 70.54 2020-04-20 📦 moinnadeem/StereoSet 📦 kanekomasahiro/evaluate_bias_in_mlm 📦 zalkikar/mlm-bias
6 BERT (large) StereoSet: Measuring stereotypical bias in pretrained language models 69.89 2020-04-20 📦 moinnadeem/StereoSet 📦 kanekomasahiro/evaluate_bias_in_mlm 📦 zalkikar/mlm-bias
7 RoBERTa (base) StereoSet: Measuring stereotypical bias in pretrained language models 67.50 2020-04-20 📦 moinnadeem/StereoSet 📦 kanekomasahiro/evaluate_bias_in_mlm 📦 zalkikar/mlm-bias
8 GAL 120B Galactica: A Large Language Model for Science 65.60 2022-11-16 📦 paperswithcode/galai
9 XLNet (base) StereoSet: Measuring stereotypical bias in pretrained language models 62.10 2020-04-20 📦 moinnadeem/StereoSet 📦 kanekomasahiro/evaluate_bias_in_mlm 📦 zalkikar/mlm-bias
10 GPT-3 (text-davinci-002) Galactica: A Large Language Model for Science 60.80 2022-11-16 📦 paperswithcode/galai

All Papers (11)