EQ-Bench

Dataset Information

Modalities

Ranking

Languages

English

Introduced

2023

License

MIT

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

This dataset contains benchmark scores for EQ-Bench, a benchmark designed to evaluate aspects of emotional intelligence in Large Language Models (LLMs). It assesses the ability of LLMs to understand complex emotions and social interactions by asking them to predict the intensity of emotional states of characters in a dialogue. The benchmark discriminates effectively between a wide range of models. The authors find that EQ-Bench correlates strongly with comprehensive multi-domain benchmarks such as MMLU (Hendrycks et al., 2020) (r=0.97), which suggests that it may be capturing similar aspects of broad intelligence. The benchmark produces highly repeatable results using a set of 60 English-language questions. Open-source code for an automated benchmarking pipeline is available at https://github.com/EQ-bench/EQ-Bench, and a leaderboard is hosted at https://www.eqbench.com.
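To make the reported correlation with MMLU (r=0.97) concrete, the sketch below shows how a coefficient of this kind could be computed from two lists of per-model scores. It assumes r denotes Pearson's correlation coefficient. The function name `pearson_r` and the score values are hypothetical placeholders, not figures from the paper or the leaderboard, and this is not the benchmark's own analysis code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator) and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder scores for four hypothetical models (NOT real results).
eq_bench_scores = [10.0, 20.0, 30.0, 40.0]
mmlu_scores = [25.0, 45.0, 65.0, 85.0]

r = pearson_r(eq_bench_scores, mmlu_scores)
print(f"r = {r:.2f}")
```

A value of r close to 1 means that models which rank high on one benchmark tend to rank high on the other. On its own it does not show that the two benchmarks measure the same ability.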

Variants: EQ-Bench

Associated Benchmarks

This dataset is used in 1 benchmark:

Emotional Intelligence - Metrics: EQ-Bench Score

Recent Benchmark Submissions

Task	Model	Paper	Date
Emotional Intelligence	OpenAI gpt-4-0613	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	migtissera/SynthIA-70B-v1.5	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	OpenAI gpt-4-0314	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	Qwen/Qwen-72B-Chat	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	Anthropic Claude2	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	meta-llama/Llama-2-70b-chat-hf	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	01-ai/Yi-34B-Chat	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	OpenAI gpt-3.5-0613	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	OpenAI gpt-3.5-turbo-0301	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	Open-Orca/Mistral-7B-OpenOrca	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	Qwen/Qwen-14B-Chat	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	OpenAI text-davinci-003	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	Intel/neural-chat-7b-v3-1	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	OpenAI text-davinci-002	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	openchat/openchat 3.5	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	lmsys/vicuna-33b-v1.3	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	meta-llama/Llama-2-13b-chat-hf	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	lmsys/vicuna-13b-v1.1	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	meta-llama/Llama-2-7b-chat-hf	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11
Emotional Intelligence	Koala 13B	EQ-Bench: An Emotional Intelligence Benchmark …	2023-12-11

Research Papers

Recent papers with results on this dataset:

EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models (2023)

External Links:

EQ-Bench
