CPED

Chinese Personalized and Emotional Dialogue

Dataset Information
Modalities: Videos, Texts, Audio
Languages: Chinese
Introduced: 2022

Overview

We construct a dataset named CPED from 40 Chinese TV shows. CPED contains multi-source knowledge related to empathy and personal characteristics, covering 13 emotions, gender, Big Five personality traits, 19 dialogue acts, and other annotations.

  • We build a multi-turn Chinese Personalized and Emotional Dialogue dataset called CPED. To the best of our knowledge, CPED is the first Chinese personalized and emotional dialogue dataset. It contains 12K dialogues and 133K utterances with multi-modal context, so it can be used for both complex dialogue understanding and human-like conversation generation.
  • CPED is annotated with 3 character attributes (name, gender, age), Big Five personality traits, 2 types of dynamic emotional information (sentiment and emotion), and dialogue acts (DAs). The personality traits and emotions can serve as prior external knowledge for open-domain conversation generation, helping a conversation system respond in a more human-like, personified way.
  • We propose three tasks for CPED: personality recognition in conversations (PRC), emotion recognition in conversations (ERC), and personalized and emotional conversation (PEC). A set of experiments verifies the importance of using personalities and emotions as prior external knowledge for conversation generation.
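
The annotation layers listed above can be pictured as a small record structure. The following Python sketch is only an illustration under stated assumptions: the class names, field names, label values, and the `build_prompt` helper are hypothetical and are not taken from the CPED release or its official loaders. It shows one way speaker-level attributes (name, gender, age, Big Five traits) and utterance-level labels (sentiment, emotion, dialogue act) might be combined into a conditioning context for a PEC-style generation model.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The five trait names of the Big Five model; the value vocabulary
# ("high"/"low"/"unknown") is an assumption made for this sketch.
BIG_FIVE = ("openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism")


@dataclass
class Speaker:
    """Static, speaker-level attributes (hypothetical field names)."""
    name: str
    gender: str
    age: str
    personality: Dict[str, str] = field(default_factory=dict)


@dataclass
class Utterance:
    """One turn with its dynamic, utterance-level labels (hypothetical)."""
    speaker: Speaker
    text: str
    sentiment: str
    emotion: str
    dialogue_act: str


def build_prompt(history: List[Utterance], responder: Speaker,
                 target_emotion: str) -> str:
    """Prefix the dialogue history with persona and emotion tags."""
    traits = ", ".join(
        f"{t}={responder.personality.get(t, 'unknown')}" for t in BIG_FIVE
    )
    lines = [
        f"[persona] {responder.name} ({responder.gender}, {responder.age}); {traits}",
        f"[emotion] {target_emotion}",
    ]
    lines += [f"{u.speaker.name}: {u.text}" for u in history]
    lines.append(f"{responder.name}:")  # the model would continue from here
    return "\n".join(lines)


# Toy example with invented speakers and labels.
alice = Speaker("Alice", "female", "young", {"openness": "high"})
bob = Speaker("Bob", "male", "middle-aged",
              {"extraversion": "high", "agreeableness": "high"})
history = [
    Utterance(alice, "I passed the exam!", "positive", "happy", "statement"),
]
prompt = build_prompt(history, responder=bob, target_emotion="happy")
print(prompt)
```

Keeping speaker-level and utterance-level labels in separate records mirrors the distinction the overview draws between static character attributes and dynamic emotional information; the flat text prefix is just one simple conditioning scheme among several possible ones.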

Variants: CPED

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Personality Recognition in Conversation | BERT$_{ssenet}^{c}$ | CPED: A Large-Scale Chinese Personalized … | 2022-05-29
Personality Recognition in Conversation | BERT$^{s}$ | CPED: A Large-Scale Chinese Personalized … | 2022-05-29
Personality Recognition in Conversation | BERT$^{c}$ | CPED: A Large-Scale Chinese Personalized … | 2022-05-29
Personality Recognition in Conversation | BERT$_{senet}^{c}$ | CPED: A Large-Scale Chinese Personalized … | 2022-05-29

Research Papers

Recent papers with results on this dataset: