IEMOCAP

The Interactive Emotional Dyadic Motion Capture (IEMOCAP) Database

Dataset Information
Modalities
Videos, Audio
Languages
English
Homepage
https://sail.usc.edu/iemocap/

Overview

The IEMOCAP dataset consists of 151 videos of recorded dialogues, with 2 speakers per dialogue, for a total of 302 videos across the dataset. Each segment is annotated for the presence of 9 emotions (angry, excited, fear, sad, surprised, frustrated, happy, disappointed, and neutral) as well as for valence, arousal, and dominance. The dataset was recorded across 5 sessions, each featuring a pair of speakers (5 pairs in total).
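As a concrete illustration of the annotation structure described above, the sketch below parses one IEMOCAP-style annotation file into per-segment records carrying a categorical emotion label plus valence/arousal/dominance ratings. The `EmoEvaluation` file layout, the abbreviated label codes (e.g. "neu", "ang"), the 1-5 dimensional scale, and the example path are assumptions based on common descriptions of the corpus, not an official loader.

```python
import re
from dataclasses import dataclass

# Hypothetical record for one annotated segment; field names are
# illustrative, not an official IEMOCAP schema.
@dataclass
class Segment:
    start: float        # segment start time (seconds)
    end: float          # segment end time (seconds)
    utterance_id: str   # e.g. "Ses01F_impro01_F000"
    emotion: str        # categorical label, e.g. "neu", "ang", "exc"
    valence: float      # dimensional ratings (assumed 1-5 scale)
    arousal: float
    dominance: float

# Matches lines of the assumed EmoEvaluation layout:
# [start - end]  utterance_id  emotion  [V, A, D]
LINE_RE = re.compile(
    r"\[(?P<start>[\d.]+) - (?P<end>[\d.]+)\]\s+"
    r"(?P<utt>\S+)\s+(?P<emo>\S+)\s+"
    r"\[(?P<v>[\d.]+), (?P<a>[\d.]+), (?P<d>[\d.]+)\]"
)

def parse_emo_evaluation(path: str) -> list[Segment]:
    """Collect all annotated segments from one annotation file."""
    segments = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            m = LINE_RE.match(line.strip())
            if m:  # skip headers and rater-comment lines
                segments.append(Segment(
                    start=float(m["start"]),
                    end=float(m["end"]),
                    utterance_id=m["utt"],
                    emotion=m["emo"],
                    valence=float(m["v"]),
                    arousal=float(m["a"]),
                    dominance=float(m["d"]),
                ))
    return segments

# Usage (path is hypothetical):
# segs = parse_emo_evaluation("Session1/dialog/EmoEvaluation/Ses01F_impro01.txt")
# print({s.emotion for s in segs})
```

A line-oriented regex keeps the sketch robust to the extra header and comment lines such files typically contain: anything that does not match the segment pattern is simply skipped.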

Source: Multi-attention Recurrent Network for Human Communication Comprehension
Dataset paper: https://sail.usc.edu/iemocap/Busso_2008_iemocap.pdf (Busso et al., 2008)

Variants: IEMOCAP

Associated Benchmarks

This dataset is used in 2 benchmarks: Multimodal Emotion Recognition and Speech Emotion Recognition.

Recent Benchmark Submissions

Task | Model | Paper | Date
--- | --- | --- | ---
Multimodal Emotion Recognition | GraphSmile | Tracing Intricate Cues in Dialogue: … | 2024-07-31
Speech Emotion Recognition | emoDARTS | emoDARTS: Joint Optimisation of CNN … | 2024-03-21
Multimodal Emotion Recognition | Joyful | Joyful: Joint Modality Fusion and … | 2023-11-18
Speech Emotion Recognition | CNN - DARTS | Enhancing Speech Emotion Recognition Through … | 2023-05-23
Speech Emotion Recognition | TAP | Speaker Normalization for Self-supervised Speech … | 2022-02-02
Speech Emotion Recognition | Partially Fine-tuned HuBERT Large | A Fine-tuned Wav2vec 2.0/HuBERT Benchmark … | 2021-11-04
Speech Emotion Recognition | CNN+LSTM | CNN+LSTM Architecture for Speech Emotion … | 2018-02-15

Research Papers

Recent papers with results on this dataset: