LRS3-TED

Visual Speech Recognition Benchmark

Performance Over Time

Metric: Word Error Rate (WER, %), lower is better | 3 results

Rank	Model	Paper	Word Error Rate (WER)	Date	Code
1	CTC/Attention	Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels	19.10	2023-03-25	mpc001/auto_avsr, umbertocappellazzo/llama-avsr
2	VTP with more data	Sub-word Level Lip Reading With Visual Attention	30.70	2021-10-14	-
3	VTP	Sub-word Level Lip Reading With Visual Attention	40.60	2021-10-14	-
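
For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is typically computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words and expressed as a percentage. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the evaluation scripts used by the papers listed here may normalize text differently (casing, punctuation), so this is not a reproduction of their scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference word count.

    Illustrative helper (hypothetical name); tokenization is a plain
    whitespace split, which is an assumption, not a benchmark rule.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up sentences, not taken from LRS3-TED.
    print(word_error_rate("a b c d", "a x c"))
```

In the made-up example, one word is substituted and one is dropped out of four reference words, so the score is 50.0. Because insertions are counted, WER can exceed 100 when the hypothesis is much longer than the reference.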

[Chart: WER on LRS3-TED by publication date. Points: VTP (2021), VTP with more data (2021), CTC/Attention (2023).]
