CAIS

Chinese Artificial Intelligence Speakers

Dataset Information
Modalities: Texts
Languages: Chinese
Introduced: 2019
License: Unknown

Overview

We collect utterances from the Chinese Artificial Intelligence Speakers (CAIS) and annotate them with slot tags and intent labels. The training, validation, and test sets are split according to the distribution of intents; detailed statistics are provided in the supplementary material. Because the utterances are collected from real-world speaker systems, the intent labels are skewed toward the PlayMusic intent. We adopt the BIOES tagging scheme for slots instead of the BIO2 scheme used in ATIS, since previous studies in sequence labeling have reported meaningful improvements with this scheme (Ratinov and Roth, 2009).
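To illustrate the difference between the two tagging schemes, the following is a minimal sketch of converting a BIO2 tag sequence to BIOES. It is not the dataset authors' preprocessing code; the function name `bio2_to_bioes` and the slot labels `artist` and `song` are hypothetical and chosen only for illustration. In BIOES, a single-token slot is tagged `S-` and the last token of a multi-token slot is tagged `E-`, which makes span boundaries explicit.

```python
from typing import List


def bio2_to_bioes(tags: List[str]) -> List[str]:
    """Convert a BIO2 tag sequence to BIOES (illustrative sketch)."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # Treat the end of the sequence as if it were followed by "O".
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The current span continues only if the next tag is I- with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            bioes.append(f"B-{label}" if continues else f"S-{label}")
        elif prefix == "I":
            bioes.append(f"I-{label}" if continues else f"E-{label}")
        else:
            raise ValueError(f"Unexpected BIO2 tag: {tag}")
    return bioes


# Hypothetical slot labels; not taken from the CAIS annotation guidelines.
bio2 = ["O", "B-artist", "I-artist", "I-artist", "O", "B-song"]
print(bio2_to_bioes(bio2))
# ['O', 'B-artist', 'I-artist', 'E-artist', 'O', 'S-song']
```

The conversion is lossless in both directions, so a model trained on BIOES tags can still be evaluated against BIO2 references after mapping `S-` back to `B-` and `E-` back to `I-`.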

Variants: CAIS

Associated Benchmarks

This dataset is used in 2 benchmarks: Slot Filling and Intent Detection.

Recent Benchmark Submissions

Task              Model   Paper                                    Date
Slot Filling      CM-Net  CM-Net: A Novel Collaborative Memory …   2019-09-16
Intent Detection  CM-Net  CM-Net: A Novel Collaborative Memory …   2019-09-16
