AliMeeting

Name: AliMeeting
Published: 2021-10-14
License: MIT License

Multi-Channel Multi-Party Meeting Transcription Challenge

Dataset Information

Modalities: Audio
Languages: Chinese
Introduced: 2021
License: MIT License
Homepage: Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

The AliMeeting corpus consists of 120 hours of recorded Mandarin meeting data, including far-field data collected by an 8-channel microphone array as well as near-field data collected by headset microphones. Each meeting session has 2-4 speakers, with speaker overlap ratios that vary across sessions, and sessions were recorded in rooms of different sizes.

Variants: AliMeeting

Associated Benchmarks

This dataset is used in 1 benchmark:

Speaker Diarization - Metrics: DER(%)
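
For readers unfamiliar with the metric, DER (diarization error rate) is the share of reference speech time that is missed, falsely detected, or attributed to the wrong speaker. The sketch below is a minimal frame-level illustration, not the scoring tool used for this benchmark. It assumes that hypothesis speaker labels have already been mapped to reference labels and that no forgiveness collar is applied; the function name `frame_der` and the toy data are hypothetical.

```python
def frame_der(reference, hypothesis):
    """Frame-level diarization error rate in percent.

    reference, hypothesis: equal-length lists, one set of active
    speaker labels per frame. Labels are assumed to be already
    mapped between the two (no optimal speaker assignment here).
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")

    missed = false_alarm = confusion = total = 0
    for ref, hyp in zip(reference, hypothesis):
        n_ref, n_hyp = len(ref), len(hyp)
        n_correct = len(ref & hyp)
        # Reference speakers with no hypothesis slot at all.
        missed += max(0, n_ref - n_hyp)
        # Hypothesis speakers beyond the number actually speaking.
        false_alarm += max(0, n_hyp - n_ref)
        # Slots that are filled, but with the wrong speaker.
        confusion += min(n_ref, n_hyp) - n_correct
        total += n_ref

    if total == 0:
        raise ValueError("reference contains no speech")
    return 100.0 * (missed + false_alarm + confusion) / total


# Toy example: four frames, the second one with overlapping speech.
reference = [{"A"}, {"A", "B"}, {"B"}, set()]
hypothesis = [{"A"}, {"A"}, {"A"}, {"B"}]
print(frame_der(reference, hypothesis))  # 75.0
```

In the toy example, one speaker is missed in the overlapped frame, one frame is attributed to the wrong speaker, and one frame is a false alarm, giving 3 errors over 4 reference speaker-frames. Overlapped frames count once per active reference speaker, which is why overlap-heavy meeting data tends to be harder on this metric.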

Recent Benchmark Submissions

Task	Model	Paper	Date
Speaker Diarization	SOND	Speaker Embedding-aware Neural Diarization: an …	2022-03-18

Research Papers

Recent papers with results on this dataset:

Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios (2022)

External Links:

AliMeeting
