CVSS

Dataset Information
Modalities: Texts, Audio, Speech
Languages: English, French, Spanish, German, Italian, Chinese, Japanese, Russian, Portuguese, Arabic, Catalan, Dutch, Estonian, Indonesian, Latvian, Persian, Slovenian, Swedish, Tamil, Turkish, Welsh, Mongolian
Introduced: 2022

Overview

CVSS is a massively multilingual-to-English speech-to-speech translation (S2ST) corpus. It covers sentence-level parallel S2ST pairs from 21 languages into English. CVSS is derived from the Common Voice speech corpus and the CoVoST 2 speech-to-text translation (ST) corpus: the English translation text from CoVoST 2 is synthesized into speech using state-of-the-art TTS systems.
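
The derivation described above can be sketched as a join between source clips and their translations, followed by a synthesis step. This is a minimal illustration, not the CVSS build pipeline: the record fields, the `build_pairs` helper, the file paths, and the `fake_tts` stand-in are all hypothetical, and no real TTS system or corpus file layout is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class S2STPair:
    """One sentence-level pair (hypothetical record layout)."""
    clip_id: str        # identifier assumed to be shared by both corpora
    source_lang: str    # language code of the source speech
    source_audio: str   # path to the source-language clip
    target_text: str    # English translation text
    target_audio: str   # path to the synthesized English speech


def build_pairs(
    source_clips: Dict[str, str],
    translations: Dict[str, str],
    source_lang: str,
    synthesize: Callable[[str, str], str],
) -> List[S2STPair]:
    """Pair each source clip with synthesized speech for its translation."""
    pairs = []
    for clip_id in sorted(source_clips):
        text = translations.get(clip_id)
        if text is None:
            continue  # no translation available, so no pair can be formed
        pairs.append(
            S2STPair(
                clip_id=clip_id,
                source_lang=source_lang,
                source_audio=source_clips[clip_id],
                target_text=text,
                target_audio=synthesize(clip_id, text),
            )
        )
    return pairs


def fake_tts(clip_id: str, text: str) -> str:
    """Stand-in for a TTS system: returns where the audio would be written."""
    return f"tts/{clip_id}.wav"


pairs = build_pairs(
    {"a": "fr/a.mp3", "b": "fr/b.mp3"},  # made-up source clips
    {"a": "Hello."},                      # made-up translation for clip "a" only
    "fr",
    fake_tts,
)
```

Passing the synthesizer as a callable keeps the pairing logic independent of any particular TTS system, so a real one could be substituted for `fake_tts` without changing `build_pairs`.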

Variants: CVSS

Associated Benchmarks

This dataset is used in 1 benchmark.

Recent Benchmark Submissions

Task | Model | Paper | Date
Speech-to-Speech Translation | SeamlessM4T Large | SeamlessM4T: Massively Multilingual & Multimodal … | 2023-08-22
Speech-to-Speech Translation | SeamlessM4T Medium | SeamlessM4T: Massively Multilingual & Multimodal … | 2023-08-22
