LRS3-TED

Audio-Visual Speech Recognition Benchmark

Performance Over Time

12 results | Metric: Word Error Rate (WER, %; lower is better)

Top Performing Models

| Rank | Model | Paper | WER (%) | Date | Code |
|---|---|---|---|---|---|
| 1 | Llama-AVSR | Large Language Models are Strong Audio-Visual Speech Recognition Learners | 0.77 | 2024-09-18 | umbertocappellazzo/llama-avsr |
| 2 | CTC/Attention | Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels | 0.90 | 2023-03-25 | mpc001/auto_avsr, umbertocappellazzo/llama-avsr |
| 3 | DistillAV | Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models | 1.30 | 2025-02-09 | jxzhanggg/DistillAV |
| 4 | AV-HuBERT Large | Robust Self-Supervised Audio-Visual Speech Recognition | 1.40 | 2022-01-05 | facebookresearch/av_hubert |
| 5 | RAVEn Large | Jointly Learning Visual and Auditory Speech Representations from Raw Data | 1.40 | 2022-12-12 | ahaliassos/raven |
| 6 | Zero-AVSR | Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations | 1.50 | 2025-03-08 | JeongHun0716/zero-avsr |
| 7 | Hyb-Conformer | End-to-end Audio-visual Speech Recognition with Conformers | 2.30 | 2021-02-12 | zziz/pwc, mpc001/Visual_Speech_Recognition_for_Multiple_Languages, mpc001/auto_avsr |
| 8 | RNN-T | Recurrent Neural Network Transducer for Audio-Visual Speech Recognition | 4.50 | 2019-11-08 | around-star/Speech-Recognition |
| 9 | EG-seq2seq | Discriminative Multi-modality Speech Recognition | 6.80 | 2020-05-12 | JackSyu/Discriminative-Multi-modality-Speech-Recognition, JackSyu/AE-MSR |
| 10 | TM-seq2seq | Deep Audio-Visual Speech Recognition | 7.20 | 2018-09-06 | lordmartian/deep_avsr, smeetrs/deep_avsr, exgc/avmust-ted, amitai1992/AutomatedLipReading |
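
For readers unfamiliar with the metric, the sketch below shows how a word error rate is typically computed: the word-level edit distance (substitutions, deletions, insertions) between a hypothesis and its reference transcript, divided by the number of reference words. The function name `word_error_rate` and the plain whitespace tokenization are assumptions made for illustration; the listed papers may apply their own text normalization (casing, punctuation) before scoring, and benchmark numbers are normally aggregated over the whole test set rather than per sentence.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: word-level edit distance / reference length.

    Illustrative helper (not taken from any listed repository).
    Tokenization is a plain whitespace split, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (minus the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing in hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER: {100 * wer:.2f}%")
```

Multiplying the returned fraction by 100 gives a percentage on the same scale as the WER column. A corpus-level score would sum the edit distances and the reference word counts over all utterances before dividing, rather than averaging per-sentence values.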

All Papers (12)