TweetQA

Name: TweetQA
Published: 2019-07-14
License: Unknown

Dataset Information

Modalities

Texts

Introduced

2019

License

Unknown

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

As social media grows in popularity as a channel where news and real-time events are reported, developing automated question answering systems is critical to the effectiveness of many applications that rely on real-time knowledge. While previous question answering (QA) datasets have concentrated on formal text such as news and Wikipedia, TweetQA is presented as the first large-scale dataset for QA over social media data. To ensure the tweets are meaningful and contain interesting information, tweets used by journalists to write news articles are gathered. Human annotators are then asked to write questions and answers based on these tweets. Unlike other QA datasets such as SQuAD, in which the answers are extractive, the answers are allowed to be abstractive. The task requires a model to read a short tweet and a question and output a text phrase (which does not need to appear in the tweet) as the answer.

Source: TWEETQA: A Social Media Focused Question Answering Dataset
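
The distinction between extractive and abstractive answers described in the overview can be illustrated with a small check. This is a minimal sketch, not part of the dataset's tooling: the function name `is_extractive`, the lowercased whitespace tokenization, and the example tweet and answers are all made up for illustration.

```python
def is_extractive(tweet, answer):
    """Return True if the answer's tokens appear as a contiguous span in the tweet.

    Tokenization is a simple lowercased whitespace split (an assumption made
    for this sketch, not the dataset's own preprocessing).
    """
    tweet_tokens = tweet.lower().split()
    answer_tokens = answer.lower().split()
    if not answer_tokens:
        return False
    span = len(answer_tokens)
    # Slide a window of the answer's length over the tweet tokens.
    return any(
        tweet_tokens[i:i + span] == answer_tokens
        for i in range(len(tweet_tokens) - span + 1)
    )


# Made-up example, not taken from TweetQA.
tweet = "Our team won the final last night"
extractive_answer = "the final"                    # copied span from the tweet
abstractive_answer = "they won the championship"   # paraphrase, not a span

print(is_extractive(tweet, extractive_answer))
print(is_extractive(tweet, abstractive_answer))
```

A span-selection model can only produce answers for which this check is true, whereas the task as described also admits answers for which it is false.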

Variants: TweetQA

Associated Benchmarks

This dataset is used in 1 benchmark:

Question Answering - Metrics: BLEU-1, ROUGE-L
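
Because answers may be abstractive, the benchmark scores word overlap rather than exact match. The following is a simplified sketch of the two listed metrics, not the official evaluation script: the whitespace tokenization, the `beta = 1.2` weighting, the choice to take the best F-score over references, and the example strings are assumptions made here for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercased whitespace split; an assumption, not the official preprocessing.
    return text.lower().split()


def bleu1(prediction, references):
    """Clipped unigram precision multiplied by a brevity penalty."""
    pred = tokenize(prediction)
    refs = [tokenize(r) for r in references]
    if not pred or not refs:
        return 0.0
    # For each token, the maximum count seen in any single reference.
    max_ref_counts = Counter()
    for ref in refs:
        for tok, cnt in Counter(ref).items():
            max_ref_counts[tok] = max(max_ref_counts[tok], cnt)
    clipped = sum(min(cnt, max_ref_counts[tok]) for tok, cnt in Counter(pred).items())
    precision = clipped / len(pred)
    # Reference length closest to the prediction length (ties go to the shorter one).
    ref_len = min((len(r) for r in refs), key=lambda n: (abs(n - len(pred)), n))
    penalty = 1.0 if len(pred) >= ref_len else math.exp(1 - ref_len / len(pred))
    return penalty * precision


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, references, beta=1.2):
    """Best LCS-based F-score over the references (beta weights recall)."""
    pred = tokenize(prediction)
    best = 0.0
    for ref in (tokenize(r) for r in references):
        lcs = lcs_length(pred, ref)
        if lcs == 0:
            continue
        p, r = lcs / len(pred), lcs / len(ref)
        best = max(best, (1 + beta ** 2) * p * r / (r + beta ** 2 * p))
    return best


# Made-up prediction and references, not taken from TweetQA.
prediction = "the home team"
references = ["the home team", "home team"]
print(bleu1(prediction, references), rouge_l(prediction, references))
```

Published leaderboard numbers depend on the exact tokenization and scorer used, so values from this sketch should not be compared directly with reported results.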

Recent Benchmark Submissions

Task	Model	Paper	Date
Question Answering	ByT5 (small)	ByT5: Towards a token-free future …	2021-05-28
Question Answering	mT5	ByT5: Towards a token-free future …	2021-05-28
Question Answering	ByT5	ByT5: Towards a token-free future …	2021-05-28

Research Papers

Recent papers with results on this dataset:

ByT5: Towards a token-free future with pre-trained byte-to-byte models (2021)
