TAT

Name: TAT
License: CC BY-NC 4.0

Taiwanese Across Taiwan

Dataset Information

Modalities

Speech

License

CC BY-NC 4.0

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

The Taiwanese Across Taiwan (TAT) corpus is a large-scale database of native Taiwanese article-reading speech collected across Taiwan. It contains native Taiwanese speech in the various accents found across Taiwan. The corpus is annotated twice for use in voice recognition research. It contains recordings from 100 native speakers, each with 30 minutes of speech, for a total of 100 hours of speech data.

Source: https://sites.google.com/speech.ntut.edu.tw/fsw/home/tat-corpus?authuser=0

Variants: TAT

Associated Benchmarks

This dataset is used in 1 benchmark:

Speech-to-Speech Translation - Metrics: ASR-BLEU (Dev), ASR-BLEU (Test)
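
ASR-BLEU scores translated speech by transcribing it with an ASR system and computing BLEU between the transcripts and the reference text. The sketch below illustrates that idea only. It is not the scoring code behind the submissions listed on this page. The `transcribe` callable, the whitespace tokenization, and the unsmoothed single-reference BLEU are all assumptions made for illustration; a real evaluation would use a trained ASR model and a standard BLEU tool.

```python
import math
from collections import Counter
from typing import Any, Callable, List, Sequence


def ngrams(tokens: Sequence[str], n: int) -> Counter:
    """Count every n-gram of order n in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses: List[str], references: List[str], max_n: int = 4) -> float:
    """Corpus-level BLEU on a 0-100 scale.

    Assumptions: one reference per hypothesis, lowercased whitespace
    tokens, no smoothing (any n-gram order with zero matches gives 0).
    """
    matched = [0] * max_n
    total = [0] * max_n
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.lower().split(), ref.lower().split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Counter intersection clips each n-gram to its reference count.
            matched[n - 1] += sum((h_counts & r_counts).values())
            total[n - 1] += sum(h_counts.values())
    if hyp_len == 0 or min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


def asr_bleu(
    audio_items: List[Any],
    references: List[str],
    transcribe: Callable[[Any], str],
) -> float:
    """Transcribe each generated utterance, then score the transcripts with BLEU.

    `transcribe` is a hypothetical stand-in for an ASR model.
    """
    transcripts = [transcribe(audio) for audio in audio_items]
    return corpus_bleu(transcripts, references)
```

To try it without an ASR model, pass a stub such as `lambda audio: audio` and use strings in place of audio items.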

Recent Benchmark Submissions

Task	Model	Paper	Date
Speech-to-Speech Translation	Hokkien→En (Two-pass decoding)	Speech-to-Speech Translation For A Real-world …	2022-10-19
Speech-to-Speech Translation	Hokkien→En (Two-stage)	Speech-to-Speech Translation For A Real-world …	2022-10-19
Speech-to-Speech Translation	Hokkien→En (Three-stage)	Speech-to-Speech Translation For A Real-world …	2022-10-19
Speech-to-Speech Translation	Hokkien→En (Single-pass decoding)	Speech-to-Speech Translation For A Real-world …	2022-10-19
Speech-to-Speech Translation	En→Hokkien (Two-pass decoding)	Speech-to-Speech Translation For A Real-world …	2022-10-19
Speech-to-Speech Translation	En→Hokkien (Three-stage)	Speech-to-Speech Translation For A Real-world …	2022-10-19
Speech-to-Speech Translation	En→Hokkien (Two-stage)	Speech-to-Speech Translation For A Real-world …	2022-10-19
Speech-to-Speech Translation	En→Hokkien (Single-pass decoding)	Speech-to-Speech Translation For A Real-world …	2022-10-19

Research Papers

Recent papers with results on this dataset:

Speech-to-Speech Translation For A Real-world Unwritten Language (2022)
