LRS3-TED

Dataset Information
Modalities: Videos

Overview

LRS3-TED is a multi-modal dataset for visual and audio-visual speech recognition. It includes face tracks from over 400 hours of TED and TEDx videos, along with the corresponding subtitles and word alignment boundaries. The dataset is substantially larger than other public datasets available for general research.

Source: LRS3-TED: a large-scale dataset for visual speech recognition

Variants: LRS3-TED

Associated Benchmarks

This dataset is used in 6 benchmarks.

Recent Benchmark Submissions

| Task | Model | Paper | Date |
|---|---|---|---|
| Audio-Visual Speech Recognition | MMS-LLaMA | MMS-LLaMA: Efficient LLM-based Audio-Visual Speech … | 2025-03-14 |
| Audio-Visual Speech Recognition | Zero-AVSR | Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition … | 2025-03-08 |
| Lipreading | DistillAV | Audio-Visual Representation Learning via Knowledge … | 2025-02-09 |
| Automatic Speech Recognition (ASR) | DistillAV | Audio-Visual Representation Learning via Knowledge … | 2025-02-09 |
| Audio-Visual Speech Recognition | DistillAV | Audio-Visual Representation Learning via Knowledge … | 2025-02-09 |
| Lipreading | USR (self-supervised) | Unified Speech Recognition: A Single … | 2024-11-04 |
| Lipreading | USR (self + semi-supervised) | Unified Speech Recognition: A Single … | 2024-11-04 |
| Audio-Visual Speech Recognition | Llama-AVSR | Large Language Models are Strong … | 2024-09-18 |
| Speech Recognition | Llama-AVSR | Large Language Models are Strong … | 2024-09-18 |
| Lipreading | SyncVSR | SyncVSR: Data-Efficient Visual Speech Recognition … | 2024-06-18 |
| Lipreading | SyncVSR | SyncVSR: Data-Efficient Visual Speech Recognition … | 2024-06-18 |
| Audio-Visual Speech Recognition | Whisper-Flamingo | Whisper-Flamingo: Integrating Visual Features into … | 2024-06-14 |
| Speech Recognition | Whisper | Whisper-Flamingo: Integrating Visual Features into … | 2024-06-14 |
| Lipreading | VSP-LLM | Where Visual Speech Meets Language: … | 2024-02-23 |
| Active Speaker Detection | GestSync | GestSync: Determining who is speaking … | 2023-10-08 |
| Automatic Speech Recognition (ASR) | CTC/Attention | Auto-AVSR: Audio-Visual Speech Recognition with … | 2023-03-25 |
| Visual Speech Recognition | CTC/Attention | Auto-AVSR: Audio-Visual Speech Recognition with … | 2023-03-25 |
| Lipreading | Auto-AVSR | Auto-AVSR: Audio-Visual Speech Recognition with … | 2023-03-25 |
| Audio-Visual Speech Recognition | CTC/Attention | Auto-AVSR: Audio-Visual Speech Recognition with … | 2023-03-25 |
| Lipreading | LP + Conformer | Conformers are All You Need … | 2023-02-17 |

Research Papers

Recent papers with results on this dataset: