Red Teaming Innodata Bias
The Innodata Red Teaming Prompts dataset aims to rigorously assess models' factuality and safety. Because it was manually created and covers a broad range of scenarios, it enables a comprehensive examination of LLM performance across diverse conditions.
Variants: rt-inod-bias
This dataset is used in 1 benchmark:
| Task | Model | Paper | Date |
|---|---|---|---|
| Bias Detection | GPT-4 | Benchmarking Llama2, Mistral, Gemma and … | 2024-04-15 |
| Bias Detection | Gemma | Benchmarking Llama2, Mistral, Gemma and … | 2024-04-15 |
| Bias Detection | Baseline | Benchmarking Llama2, Mistral, Gemma and … | 2024-04-15 |
| Bias Detection | Mistral | Benchmarking Llama2, Mistral, Gemma and … | 2024-04-15 |
| Bias Detection | Llama2 | Benchmarking Llama2, Mistral, Gemma and … | 2024-04-15 |