CREMA-D

Dataset Information
Modalities: Audio
License: Unknown

Overview

CREMA-D is an emotional multimodal actor dataset of 7,442 original clips from 91 actors: 48 male and 43 female, between the ages of 20 and 74, from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified).

Actors spoke from a selection of 12 sentences. Each sentence was presented in one of six emotions (Anger, Disgust, Fear, Happy, Neutral, and Sad) at one of four emotion levels (Low, Medium, High, and Unspecified).
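The label scheme above can be sketched as a small filename parser. This is a minimal sketch, assuming clips are named `ActorID_SentenceCode_Emotion_Level.wav` (for example `1001_DFA_ANG_XX.wav`) with three-letter emotion codes and two-letter level codes. That naming convention is not stated on this page, so treat it, along with the `ClipLabel` and `parse_clip_name` names, as assumptions to check against the copy of the dataset you download.

```python
from dataclasses import dataclass
from pathlib import Path

# Assumed code-to-label mappings; the six emotions and four levels come from
# the description above, the short codes are an assumption about file naming.
EMOTIONS = {
    "ANG": "Anger", "DIS": "Disgust", "FEA": "Fear",
    "HAP": "Happy", "NEU": "Neutral", "SAD": "Sad",
}
LEVELS = {"LO": "Low", "MD": "Medium", "HI": "High", "XX": "Unspecified"}


@dataclass(frozen=True)
class ClipLabel:
    """Hypothetical container for the labels encoded in one clip name."""
    actor_id: int
    sentence: str
    emotion: str
    level: str


def parse_clip_name(filename: str) -> ClipLabel:
    """Split an assumed 'Actor_Sentence_Emotion_Level' file stem into labels."""
    parts = Path(filename).stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"unexpected clip name: {filename!r}")
    actor, sentence, emotion_code, level_code = parts
    if emotion_code not in EMOTIONS or level_code not in LEVELS:
        raise ValueError(f"unknown emotion or level code in {filename!r}")
    return ClipLabel(int(actor), sentence, EMOTIONS[emotion_code], LEVELS[level_code])


label = parse_clip_name("1001_DFA_ANG_XX.wav")
print(label.emotion, label.level)
```

Keeping the code tables as plain dictionaries makes it easy to adjust them if a given distribution of the dataset uses different codes.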

Participants rated the emotion and emotion levels based on the combined audiovisual presentation, the video alone, and the audio alone. Because of the large number of ratings needed, this effort was crowd-sourced: a total of 2,443 participants each rated 90 unique clips (30 audio, 30 visual, and 30 audio-visual). 95% of the clips have more than 7 ratings.
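As a rough consistency check on these figures, the arithmetic below estimates the average number of ratings per clip in each modality. It is a back-of-the-envelope sketch that assumes every participant completed all 90 ratings and that ratings were spread evenly over clips and modalities. Neither assumption is stated on this page, so the result is an upper-bound estimate, not a reported statistic.

```python
# Figures taken from the description above.
participants = 2443
clips_per_participant = 90   # 30 audio + 30 visual + 30 audio-visual
total_clips = 7442
modalities = 3               # audio, visual, audio-visual

# Assumption: all participants finished all of their ratings.
total_ratings = participants * clips_per_participant

# Each clip can be rated in each of the three presentation modes.
clip_modality_pairs = total_clips * modalities

# Assumption: ratings are distributed evenly across clip/modality pairs.
mean_ratings = total_ratings / clip_modality_pairs

print(f"total ratings: {total_ratings}")              # 219870
print(f"clip/modality pairs: {clip_modality_pairs}")  # 22326
print(f"mean ratings per pair: {mean_ratings:.2f}")   # 9.85
```

Under these assumptions the average is just under 10 ratings per clip and modality, which is in line with the statement that 95% of the clips have more than 7 ratings.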

Variants: CREMA-D

Associated Benchmarks

This dataset is used in 5 benchmarks.

Recent Benchmark Submissions

Task | Model | Paper | Date
Speech Emotion Recognition | Vertically long patch ViT | Accuracy enhancement method for speech … | 2024-03-26
Talking Face Generation | EmoGen | Emotionally Enhanced Talking Face Generation | 2023-03-21
Audio Classification | EfficientLEAF | EfficientLEAF: A Faster LEarnable Audio … | 2022-07-12
Audio Classification | melspect | EfficientLEAF: A Faster LEarnable Audio … | 2022-07-12
Audio Classification | LEAF | EfficientLEAF: A Faster LEarnable Audio … | 2022-07-12
Self-Supervised Learning | Hybrid BYOL-S/CvT | BYOL-S: Learning Self-supervised Speech Representations … | 2022-06-24
Speech Emotion Recognition | SepTr + LeRaC | Learning Rate Curriculum | 2022-05-18
Speech Emotion Recognition | SepTr | SepTr: Separable Transformer for Audio … | 2022-03-17
Speech Emotion Recognition | ConformerXL-P | BigSSL: Exploring the Frontier of … | 2021-09-27
Speech Emotion Recognition | ViT | AST: Audio Spectrogram Transformer | 2021-04-05
Speech Emotion Recognition | ResNet-18 + SPEL | Self-paced ensemble learning for speech … | 2021-03-22
Speech Emotion Recognition | ResNet-18 + PyNADA | Non-linear Neurons with Human-like Apical … | 2020-02-02
Speech Emotion Recognition | GRU | Visually Guided Self Supervised Learning … | 2020-01-13

Research Papers

Recent papers with results on this dataset: