| TM-CTC | Deep Audio-Visual Speech Recognition | 10.10 | 2018-09-06 |
| TM-seq2seq | Deep Audio-Visual Speech Recognition | 9.70 | 2018-09-06 |
| CTC/attention | Audio-Visual Speech Recognition With A Hybrid CTC… | 8.20 | 2018-09-28 |
| LF-MMI TDNN | Audio-visual Recognition of Overlapped speech for… | 6.70 | 2020-01-06 |
| Whisper-LLaMA | Whispering LLaMA: A Cross-Modal Generative Error … | 6.60 | 2023-10-10 |
| End2end Conformer | End-to-end Audio-visual Speech Recognition with C… | 3.90 | 2021-02-12 |
| MoCo + wav2vec (w/o extLM) | Leveraging Unimodal Self-Supervised Learning for … | 2.70 | 2022-02-24 |
| CTC/Attention | Auto-AVSR: Audio-Visual Speech Recognition with A… | 1.50 | 2023-03-25 |
| Whisper | Whisper-Flamingo: Integrating Visual Features int… | 1.30 | 2024-06-14 |