Lip Reading in the Wild
The Lip Reading in the Wild (LRW) dataset is a large-scale audio-visual database that contains 500 different words spoken by over 1,000 speakers. Each utterance has 29 frames, centered on the target word. The database is divided into training, validation, and test sets. The training set contains at least 800 utterances per class, while the validation and test sets each contain 50 utterances per class.
Source: Towards Pose-invariant Lip-Reading
Image Source: https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrw1.html
Variants: LRW, Lip Reading in the Wild, Lipreading in the Wild
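The split sizes in the description imply lower bounds on the total number of utterances. The following is a minimal sketch of that arithmetic. The class name `LRWSpec` and its methods are hypothetical and used only for illustration; they are not part of any official LRW tooling. The numbers come from the description above, and the training figure is a minimum because the description only says "at least 800" per class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LRWSpec:
    """Hypothetical container for the figures quoted in the description."""
    num_classes: int = 500          # distinct target words
    frames_per_clip: int = 29       # frames in each utterance
    min_train_per_class: int = 800  # "at least 800" training utterances
    val_per_class: int = 50
    test_per_class: int = 50

    def min_split_sizes(self) -> dict:
        # Training count is a lower bound; validation and test are exact
        # under the per-class figures stated in the description.
        return {
            "train": self.num_classes * self.min_train_per_class,
            "val": self.num_classes * self.val_per_class,
            "test": self.num_classes * self.test_per_class,
        }

    def center_frame_index(self) -> int:
        # Zero-based index of the middle frame. Clips are centered on the
        # target word, so this frame should fall roughly mid-word.
        return self.frames_per_clip // 2


spec = LRWSpec()
print(spec.min_split_sizes())     # {'train': 400000, 'val': 25000, 'test': 25000}
print(spec.center_frame_index())  # 14
```

Under these assumptions, the training set holds at least 400,000 utterances, and the validation and test sets hold 25,000 each.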
This dataset is used in 4 benchmarks:
Task | Model | Paper | Date |
---|---|---|---|
Lip to Speech Synthesis | Lip2Wav | Learning Individual Speaking Styles for … | 2020-05-17 |
Lip Reading | Lip2Wav | Learning Individual Speaking Styles for … | 2020-05-17 |
Talking Face Generation | LipGAN | Towards Automatic Face-to-Face Translation | 2020-03-01 |
Recent papers with results on this dataset: