OpenSLR

Name: OpenSLR
Published: 2020-05-22
License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Open Speech and Language Resources

Dataset Information

Modalities

Texts, Audio

Languages

English, Spanish, Bengali, Afrikaans, Basque, Catalan, Galician, Marathi, Tamil, Telugu, Yoruba, Gujarati, Javanese, Kannada, Central Khmer, Malayalam, Burmese, Nepali, Sinhala, Sundanese, Venda, Xhosa

Introduced

2020

License

Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

OpenSLR is a repository of open speech and language resources, including large-scale transcribed audio corpora and related software. It serves as a central platform where researchers and practitioners can access and share datasets for automatic speech recognition (ASR), text-to-speech (TTS), and linguistic research.

The OpenSLR collection includes over 30 datasets spanning more than 25 languages, including Javanese, Nepali, Malayalam, and Yoruba, as well as several English and Spanish dialects. Contributors include Google and North-West University, among other institutions. Most datasets pair audio recordings with transcriptions and cover both crowd-sourced and professionally recorded material.

Many of the datasets are high-quality multi-speaker corpora intended for use in building ASR and TTS models, particularly for under-resourced languages. Use cases include multilingual speech recognition, dialect modeling, language technology research, and building open-source voice applications.

The OpenSLR site also mirrors widely used tools and models to ensure their continued availability.

Website: OpenSLR website

Source: openslr/openslr

Variants: OpenSLR

Associated Benchmarks

This dataset is used in 1 benchmark:

Speech Recognition - Metrics: Test WER
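
The Test WER metric listed above is the word error rate: the number of word-level substitutions, deletions, and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that calculation, not the benchmark's own scoring code. It assumes plain whitespace tokenization and no text normalization (casing, punctuation), and the function name word_error_rate is chosen here for illustration; the benchmark may tokenize and normalize differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words.

    Assumption: words are split on whitespace and compared exactly,
    with no case folding or punctuation stripping.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word ("the") is missing from the hypothesis.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because the denominator is the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words. Scores are only comparable across systems when the same normalization is applied to both transcripts.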

Recent Benchmark Submissions

No recent benchmark submissions available for this dataset.

Research Papers

No papers with results on this dataset found.

External Links:

OpenSLR
