AISHELL-1

Dataset Information
Modalities: Speech
Languages: Mandarin Chinese

Overview

AISHELL-1 is an open-source Mandarin speech corpus intended for speech recognition research and for building Mandarin speech recognition systems.

Source: AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

Variants: AISHELL-1

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

| Task | Model | Paper | Date |
| --- | --- | --- | --- |
| Speech Recognition | FireRedASR-AED | FireRedASR: Open-Source Industrial-Grade Mandarin Speech … | 2025-01-24 |
| Speech Recognition | Zipformer+CR-CTC (no external language model) | CR-CTC: Consistency regularization on CTC … | 2024-10-07 |
| Speech Recognition | Lightweight Transducer | Lightweight Transducer Based on Frame-Level … | 2024-09-05 |
| Speech Recognition | Lightweight Transducer With LM | Lightweight Transducer Based on Frame-Level … | 2024-09-05 |
| Speech Recognition | Seed-ASR | Seed-ASR: Understanding Diverse Speech and … | 2024-07-05 |
| Speech Recognition | Qwen-Audio | Qwen-Audio: Advancing Universal Audio Understanding … | 2023-11-14 |
| Speech Recognition | UMA | Unimodal Aggregation for CTC-based Speech … | 2023-09-15 |
| Speech Recognition | BAT | BAT: Boundary aware transducer for … | 2023-05-19 |
| Speech Recognition | Paraformer | FunASR: A Fundamental End-to-End Speech … | 2023-05-18 |
| Speech Recognition | Paraformer-large | FunASR: A Fundamental End-to-End Speech … | 2023-05-18 |
| Speech Recognition | BRA-E | Beyond Universal Transformer: block reusing … | 2023-03-23 |
| Speech Recognition | CIF-HKD With LM | Knowledge Transfer from Pre-trained Language … | 2023-01-30 |
| Speech Recognition | MMSpeech With LM | MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training … | 2022-11-29 |
| Speech Recognition | SE-WSBO With LM | Improving Mandarin Speech Recogntion with … | 2022-07-24 |
| Speech Recognition | U2 | Unified Streaming and Non-streaming Two-pass … | 2020-12-10 |
| Speech Recognition | CTC-CRF 4gram-LM | CAT: A CTC-CRF based ASR … | 2020-05-27 |
| Speech Recognition | CTC/Att | A Comparative Study on Transformer … | 2019-09-13 |
| Speech Recognition | Att | End-to-end Speech Recognition with Adaptive … | 2018-08-30 |

Research Papers

Recent papers with results on this dataset: