tweetSentBR

Name: tweetSentBR
Published: 2017-12-24
License: Unknown

Dataset Information

Introduced: 2017
License: Unknown

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

TweetSentBR is a manually annotated corpus for sentiment analysis in Brazilian Portuguese.

Description:
- The dataset consists of 15,000 manually annotated sentences extracted from tweets in Brazilian Portuguese.
- These sentences are specifically related to the TV show domain.
- Each sentence is labeled with one of three sentiment classes: positive, neutral, or negative.
- The annotation process followed literature guidelines to ensure reliability.
Purpose:
- Researchers and practitioners in the field of Natural Language Processing (NLP) use this dataset for sentiment analysis tasks.
- It serves as a benchmark for developing and evaluating novel methods and approaches for sentiment classification.
Performance:
- Baseline experiments on polarity classification with three machine learning methods reached the following results:
  - Binary classification (positive vs. negative): 80.99% F-Measure and 82.06% accuracy.
  - Three-point classification (positive, neutral, negative): 59.85% F-Measure and 64.62% accuracy.
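
The two metrics reported above can be computed as in the following minimal sketch. It is an illustration, not the authors' evaluation code: the function names and the toy labels are hypothetical, and macro averaging is an assumption, since the overview does not say how the F-Measure was averaged. The binary setting is approximated here by dropping examples whose gold label is neutral.

```python
# Illustrative metric helpers for three-class sentiment labels.
# All names and the sample data below are hypothetical.

LABELS = ("positive", "neutral", "negative")


def accuracy(gold, pred):
    """Fraction of predictions that match the gold labels."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (macro averaging, assumed)."""
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


def drop_neutral(gold, pred):
    """Keep only examples whose gold label is positive or negative."""
    pairs = [(g, p) for g, p in zip(gold, pred) if g != "neutral"]
    return [g for g, _ in pairs], [p for _, p in pairs]


if __name__ == "__main__":
    # Toy labels, not taken from the dataset.
    gold = ["positive", "neutral", "negative", "positive"]
    pred = ["positive", "negative", "negative", "neutral"]
    print("three-class accuracy:", accuracy(gold, pred))
    print("three-class macro F1:", macro_f1(gold, pred))
    bin_gold, bin_pred = drop_neutral(gold, pred)
    print("binary accuracy:", accuracy(bin_gold, bin_pred))
```

Macro averaging weights each class equally, so a rarely predicted class can pull the score well below accuracy. That is one possible reason for a gap between F-Measure and accuracy in a three-class setting.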

Source: Conversation with Bing, 3/16/2024
(1) Building a Sentiment Corpus of Tweets in Brazilian Portuguese. https://arxiv.org/abs/1712.08917.
(2) 7 Best Portuguese Language Speech Datasets of 2022 | Twine. https://www.twine.net/blog/portuguese-language-speech-datasets/.
(3) A survey and study impact of tweet sentiment analysis via ... - Springer. https://link.springer.com/article/10.1007/s10579-023-09687-8.
(4) Top 25 Twitter Datasets for NLP and Machine Learning | iMerit. https://imerit.net/blog/top-25-twitter-datasets-for-natural-language-processing-and-machine-learning-all-pbm/.
(5) Building a Sentiment Corpus of Tweets in Brazilian Portuguese - arXiv.org. https://arxiv.org/pdf/1712.08917v1.pdf.
(6) Building a Sentiment Corpus of Tweets in Brazilian Portuguese (DOI). https://doi.org/10.48550/arXiv.1712.08917.

Variants: tweetSentBR

Associated Benchmarks

This dataset is used in 1 benchmark:

Text Generation - Metrics: f1-macro

Recent Benchmark Submissions

No recent benchmark submissions available for this dataset.

Research Papers

No papers with results on this dataset found.

External Links:

Papers with Code Entry
