CREMA-D

Name: CREMA-D
License: Unknown

Dataset Information

Modalities

Audio

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

CREMA-D is an emotional multimodal actor data set of 7,442 original clips from 91 actors: 48 male and 43 female, between the ages of 20 and 74, from a variety of races and ethnicities (African American, Asian, Caucasian, Hispanic, and Unspecified).

Actors spoke from a selection of 12 sentences. Each sentence was presented using one of six emotions (Anger, Disgust, Fear, Happy, Neutral, and Sad) and one of four emotion levels (Low, Medium, High, and Unspecified).
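
The label structure described above (actor, sentence, emotion, level) can be sketched as a small parser. This is a minimal illustration, not the dataset's official tooling: it assumes clip file names follow an `ActorID_Sentence_Emotion_Level` pattern such as `1001_DFA_ANG_XX.wav`, and the three-letter emotion codes and two-letter level codes below are likewise assumptions that should be checked against the files you actually download. `ClipLabel` and `parse_clip_name` are hypothetical names.

```python
from typing import NamedTuple

# Assumed code-to-label mappings; the emotions and levels themselves come
# from the dataset description, the short codes are an assumption.
EMOTIONS = {
    "ANG": "Anger", "DIS": "Disgust", "FEA": "Fear",
    "HAP": "Happy", "NEU": "Neutral", "SAD": "Sad",
}
LEVELS = {"LO": "Low", "MD": "Medium", "HI": "High", "XX": "Unspecified"}


class ClipLabel(NamedTuple):
    actor_id: str
    sentence: str
    emotion: str
    level: str


def parse_clip_name(filename: str) -> ClipLabel:
    """Split an assumed 'Actor_Sentence_Emotion_Level.ext' name into labels."""
    # Drop any directory prefix and the file extension.
    stem = filename.rsplit("/", 1)[-1].split(".", 1)[0]
    parts = stem.split("_")
    if len(parts) != 4:
        raise ValueError(f"unexpected clip name: {filename!r}")
    actor_id, sentence, emotion_code, level_code = parts
    if emotion_code not in EMOTIONS or level_code not in LEVELS:
        raise ValueError(f"unknown emotion or level code in {filename!r}")
    return ClipLabel(actor_id, sentence, EMOTIONS[emotion_code], LEVELS[level_code])


label = parse_clip_name("1001_DFA_ANG_XX.wav")
print(label.emotion, label.level)
```

Rejecting unknown codes with an error, rather than passing them through, makes a mismatch between the assumed naming scheme and the real files visible immediately.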

Participants rated the emotion and emotion level of each clip based on the combined audiovisual presentation, the video alone, and the audio alone. Because of the large number of ratings needed, the effort was crowd-sourced: a total of 2,443 participants each rated 90 unique clips (30 audio, 30 visual, and 30 audio-visual). 95% of the clips have more than 7 ratings.
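
As a rough sanity check on these figures, the arithmetic below estimates the average number of ratings per stimulus. It is a back-of-the-envelope sketch that assumes every participant completed all 90 ratings and that each clip is rated separately in each of the three presentation modes; the real distribution is uneven, which is why the description quotes a 95% figure rather than a mean.

```python
# Figures taken from the dataset description above.
participants = 2443
clips_per_participant = 90
clips = 7442
presentation_modes = 3  # audio-only, video-only, audio-visual

# Upper bound on collected ratings, assuming every participant finished.
total_ratings = participants * clips_per_participant

# Each clip is a separate stimulus in each presentation mode (assumption).
stimuli = clips * presentation_modes

mean_ratings_per_stimulus = total_ratings / stimuli
print(total_ratings, stimuli, round(mean_ratings_per_stimulus, 2))
# prints: 219870 22326 9.85
```

An average near 10 ratings per stimulus is consistent with most clips having more than 7 ratings.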

Variants: CREMA-D

Associated Benchmarks

This dataset is used in 5 benchmarks:

Audio Classification - Metrics: Accuracy
Facial Expression Recognition (FER) - Metrics: UAR
Self-Supervised Learning - Metrics: Accuracy
Talking Face Generation - Metrics: EmoAcc, FID, LSE-C
Speech Emotion Recognition - Metrics: Accuracy
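
Two of the metrics listed above, Accuracy and UAR (unweighted average recall, i.e. recall averaged over classes with equal weight), can be illustrated with a short sketch. The function names and the toy labels are hypothetical, and individual benchmarks may compute their scores with different splits or tooling; EmoAcc, FID, and LSE-C are not covered here.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted emotion matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall, so every emotion counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy example: a model that always predicts "Anger".
y_true = ["Anger", "Anger", "Anger", "Sad"]
y_pred = ["Anger", "Anger", "Anger", "Anger"]

print(accuracy(y_true, y_pred))                   # 0.75
print(unweighted_average_recall(y_true, y_pred))  # 0.5
```

The toy case shows why the two can diverge: accuracy rewards getting the majority class right, while UAR penalizes the missed "Sad" class as heavily as it rewards the "Anger" class.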

Recent Benchmark Submissions

Task	Model	Paper	Date
Speech Emotion Recognition	Vertically long patch ViT	Accuracy enhancement method for speech …	2024-03-26
Talking Face Generation	EmoGen	Emotionally Enhanced Talking Face Generation	2023-03-21
Audio Classification	EfficientLEAF	EfficientLEAF: A Faster LEarnable Audio …	2022-07-12
Audio Classification	melspect	EfficientLEAF: A Faster LEarnable Audio …	2022-07-12
Audio Classification	LEAF	EfficientLEAF: A Faster LEarnable Audio …	2022-07-12
Self-Supervised Learning	Hybrid BYOL-S/CvT	BYOL-S: Learning Self-supervised Speech Representations …	2022-06-24
Speech Emotion Recognition	SepTr + LeRaC	Learning Rate Curriculum	2022-05-18
Speech Emotion Recognition	SepTr	SepTr: Separable Transformer for Audio …	2022-03-17
Speech Emotion Recognition	ConformerXL-P	BigSSL: Exploring the Frontier of …	2021-09-27
Speech Emotion Recognition	ViT	AST: Audio Spectrogram Transformer	2021-04-05
Speech Emotion Recognition	ResNet-18 + SPEL	Self-paced ensemble learning for speech …	2021-03-22
Speech Emotion Recognition	ResNet-18 + PyNADA	Non-linear Neurons with Human-like Apical …	2020-02-02
Speech Emotion Recognition	GRU	Visually Guided Self Supervised Learning …	2020-01-13

Research Papers

Recent papers with results on this dataset:
