LibriTTS

Name: LibriTTS
Published: 2019-04-05
License: Unknown

Dataset Information

Modalities

Texts, Audio, Speech

Languages

English

Introduced

2019

Homepage

Official Website

Overview

LibriTTS is a multi-speaker corpus of approximately 585 hours of read English speech at a 24 kHz sampling rate, prepared by Heiga Zen with the assistance of Google Speech and Google Brain team members. It is designed for text-to-speech (TTS) research and is derived from the original materials of the LibriSpeech corpus (MP3 audio files from LibriVox and text files from Project Gutenberg). The main differences from LibriSpeech are:

The audio files are at a 24 kHz sampling rate (LibriSpeech uses 16 kHz).
The speech is split at sentence breaks.
Both original and normalized texts are included.
Contextual information (e.g., neighbouring sentences) can be extracted.
Utterances with significant background noise are excluded.
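
To make the points above concrete, here is a minimal sketch of reading one utterance together with its two transcripts. It assumes a file layout that this page does not specify: each `<utterance>.wav` sits next to `<utterance>.original.txt` and `<utterance>.normalized.txt`, and the audio is uncompressed PCM that Python's standard `wave` module can read. The function name `read_utterance` and the field names are illustrative only.

```python
import wave
from pathlib import Path

EXPECTED_RATE = 24_000  # 24 kHz, as stated in the overview


def read_utterance(wav_path):
    """Return basic audio properties and both transcripts for one utterance."""
    wav_path = Path(wav_path)

    # Read the header only; no audio samples are loaded here.
    with wave.open(str(wav_path), "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate

    def read_text(suffix):
        # Assumed naming: the transcript shares the wav file's stem.
        path = wav_path.with_name(wav_path.stem + suffix)
        if not path.exists():
            return None
        return path.read_text(encoding="utf-8").strip()

    return {
        "id": wav_path.stem,
        "sample_rate": rate,
        "is_24khz": rate == EXPECTED_RATE,
        "duration_s": duration,
        "original": read_text(".original.txt"),
        "normalized": read_text(".normalized.txt"),
    }
```

Calling `read_utterance("path/to/utterance.wav")` (a hypothetical path) returns a dictionary whose `original` and `normalized` fields are `None` when the corresponding text file is absent, so a mismatch with the assumed layout shows up as missing text instead of an exception.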

Variants: LibriTTS

Associated Benchmarks

This dataset is used in 1 benchmark:

Speech Synthesis - Metrics: PESQ, M-STFT, MCD, Periodicity, V/UV F1

Recent Benchmark Submissions

Task	Model	Paper	Date
Speech Synthesis	PeriodWave-Turbo-L	Accelerating High-Fidelity Waveform Generation via …	2024-08-15
Speech Synthesis	PeriodWave + FreeU	PeriodWave: Multi-Period Flow Matching for …	2024-08-14
Speech Synthesis	RFWave	RFWave: Multi-band Rectified Flow for …	2024-03-08
Speech Synthesis	EVA-GAN-base	EVA-GAN: Enhanced Various Audio Generation …	2024-01-31
Speech Synthesis	EVA-GAN-big	EVA-GAN: Enhanced Various Audio Generation …	2024-01-31
Speech Synthesis	BigVSAN	BigVSAN: Enhancing GAN-based Neural Vocoders …	2023-09-06
Speech Synthesis	BigVSAN (w/ snakebeta)	BigVSAN: Enhancing GAN-based Neural Vocoders …	2023-09-06
Speech Synthesis	Vocos	Vocos: Closing the gap between …	2023-06-01
Speech Synthesis	BigVGAN-base	BigVGAN: A Universal Neural Vocoder …	2022-06-09
Speech Synthesis	BigVGAN-v2	BigVGAN: A Universal Neural Vocoder …	2022-06-09
Speech Synthesis	BigVGAN	BigVGAN: A Universal Neural Vocoder …	2022-06-09
Speech Synthesis	HiFi-GAN	HiFi-GAN: Generative Adversarial Networks for …	2020-10-12
Speech Synthesis	SC-WaveRNN	Speaker Conditional WaveRNN: Towards Universal …	2020-08-09
Speech Synthesis	WaveFlow	WaveFlow: A Compact Flow-based Model …	2019-12-03
Speech Synthesis	WaveGlow	WaveGlow: A Flow-based Generative Network …	2018-10-31
