VOCASET is a 4D face dataset containing about 29 minutes of 4D scans captured at 60 fps, together with synchronized audio. It covers 12 subjects and 480 sequences of about 3-4 seconds each; the spoken sentences are drawn from an array of standard protocols that maximize phonetic diversity.
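As a quick sanity check on these figures, the sketch below derives an approximate frame count and mean sequence length from the numbers quoted above. The constants are the approximate values from the description, not measured from the data, and the per-subject figure assumes sequences are split evenly across subjects, which the description does not state.

```python
# Approximate figures quoted in the dataset description.
TOTAL_MINUTES = 29     # "about 29 minutes" of 4D scans
FPS = 60               # capture rate
NUM_SUBJECTS = 12
NUM_SEQUENCES = 480

total_seconds = TOTAL_MINUTES * 60
approx_frames = total_seconds * FPS

# Mean sequence length implied by the totals; should land in the 3-4 s range.
mean_sequence_seconds = total_seconds / NUM_SEQUENCES

# Assumption: an even split across subjects (not confirmed by the description).
sequences_per_subject = NUM_SEQUENCES / NUM_SUBJECTS

print(f"approx. frames:           {approx_frames}")
print(f"mean sequence length (s): {mean_sequence_seconds}")
print(f"sequences per subject:    {sequences_per_subject}")
```

The implied mean length falls inside the stated 3-4 second range, so the quoted totals are mutually consistent.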
Source: timzhang642
Variants: VOCASET
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
3D Face Animation | FaceFormer | FaceFormer: Speech-Driven 3D Facial Animation … | 2021-12-10 |
3D Face Animation | MeshTalk | MeshTalk: 3D Face Animation from … | 2021-04-16 |
Recent papers with results on this dataset: