FLoRes-200

Dataset Information
Modalities: Texts
Languages: English, Arabic
Introduced: 2022
License: Unknown

Overview

FLoRes-200 doubles the language coverage of FLoRes-101. Because the new languages are less standardized and require more specialized professional translations, the verification process became more complex, which in turn required modifications to the translation workflow. Several FLoRes-200 languages were not translated from English; they were translated instead from Spanish, French, Russian, or Modern Standard Arabic.

Variants: FLoRes-200, flores200-dev, flores200-devtest
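
The dev and devtest variants are typically consumed as line-aligned text files, where line i in one language file is the translation of line i in every other. The following is a minimal sketch of pairing two languages under that assumption. The directory layout (`<root>/<split>/<lang>.<split>`), the language codes `eng_Latn` and `arb_Arab`, and the helper name `load_parallel` are illustrative assumptions, not something this page specifies.

```python
from pathlib import Path
from typing import List, Tuple


def load_parallel(root: Path, split: str, src: str, tgt: str) -> List[Tuple[str, str]]:
    """Pair line i of the source file with line i of the target file.

    Assumed (hypothetical) layout: <root>/<split>/<lang>.<split>,
    one sentence per line, all language files aligned by line number.
    """
    src_path = root / split / f"{src}.{split}"
    tgt_path = root / split / f"{tgt}.{split}"
    src_lines = src_path.read_text(encoding="utf-8").splitlines()
    tgt_lines = tgt_path.read_text(encoding="utf-8").splitlines()
    # Misaligned files would silently produce wrong pairs, so fail loudly.
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} in {src_path.name}, "
            f"{len(tgt_lines)} in {tgt_path.name}"
        )
    return list(zip(src_lines, tgt_lines))
```

Under those assumptions, `load_parallel(Path("flores200_dataset"), "devtest", "eng_Latn", "arb_Arab")` would return English and Modern Standard Arabic sentence pairs for the devtest variant. The root directory name here is likewise a placeholder.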

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Machine Translation | GenTranslate-7B | GenTranslate: Large Language Models are … | 2024-02-10
Machine Translation | ALMA-13B | A Paradigm Shift in Machine … | 2023-09-20
Machine Translation | SeamlessM4T-Large-V1 | SeamlessM4T: Massively Multilingual & Multimodal … | 2023-08-22
Machine Translation | BigTranslate | BigTranslate: Augmenting Large Language Models … | 2023-05-29
Machine Translation | NLLB-3.3B | No Language Left Behind: Scaling … | 2022-07-11

Research Papers

Recent papers with results on this dataset: