SST-5

Dataset Information
Introduced: 2013
License: Unknown
Homepage

Overview

SST-5, the Stanford Sentiment Treebank with five labels, is a dataset for sentiment analysis. It consists of 11,855 single sentences extracted from movie reviews¹ and includes 215,154 unique phrases from the sentences' parse trees, each annotated by 3 human judges¹. Each phrase is labeled as negative, somewhat negative, neutral, somewhat positive, or positive. These five classes are why the dataset is called SST-5 or SST fine-grained.
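The five-way label scheme can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset's tooling: the 0 to 4 index order, the helper name `score_to_label`, and the idea of binning a continuous sentiment score in [0, 1] into five equal-width intervals are all assumptions made for the example.

```python
# Label names as listed in the overview; the 0-4 index order is an assumption.
SST5_LABELS = [
    "negative",
    "somewhat negative",
    "neutral",
    "somewhat positive",
    "positive",
]

# Assumed upper bounds of five equal-width bins over a score in [0, 1]:
# [0, 0.2], (0.2, 0.4], (0.4, 0.6], (0.6, 0.8], (0.8, 1.0]
UPPER_BOUNDS = [0.2, 0.4, 0.6, 0.8, 1.0]


def score_to_label(score: float) -> int:
    """Map a sentiment score in [0, 1] to a class index (hypothetical helper)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    # Return the index of the first bin whose upper bound covers the score.
    for index, upper in enumerate(UPPER_BOUNDS):
        if score <= upper:
            return index
    raise AssertionError("unreachable: 1.0 is the last upper bound")


if __name__ == "__main__":
    for s in (0.05, 0.5, 0.95):
        print(s, SST5_LABELS[score_to_label(s)])
```

Under these assumptions, a score of 0.5 falls in the middle bin and maps to "neutral", while the two outer bins correspond to the negative and positive classes.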

Variants: SST-5 Fine-grained classification, SST-5

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Few-Shot Text Classification | SetFit + OCD | OCD: Learning to Overfit with … | 2022-10-02

Research Papers

Recent papers with results on this dataset: