Long-Tailed Institution Names
LoT-insts contains over 25k classes whose frequencies naturally follow a long-tailed distribution. Its test set consists of four subsets: many-, medium-, and few-shot sets, as well as a zero-shot open set. To the best of our knowledge, this is the first natural language dataset that focuses on this long-tailed and open classification problem.
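As a rough illustration of how a test set can be partitioned into the four subsets described above, the sketch below buckets test classes by how often they occur in the training data. The thresholds (`MANY_MIN`, `FEW_MAX`), the function name `bucket_classes`, and the toy institution names are all assumptions made for this example; the description above does not state the cut-offs actually used for LoT-insts.

```python
from collections import Counter

# Hypothetical cut-offs, chosen only for illustration.
MANY_MIN = 21  # a class with at least this many training examples is "many-shot"
FEW_MAX = 4    # a class with at most this many training examples is "few-shot"


def bucket_classes(train_labels, test_labels, many_min=MANY_MIN, few_max=FEW_MAX):
    """Group test classes into many/medium/few/zero-shot by training frequency."""
    freq = Counter(train_labels)
    buckets = {"many": set(), "medium": set(), "few": set(), "zero": set()}
    for label in set(test_labels):
        n = freq.get(label, 0)
        if n == 0:
            # Never seen during training: belongs to the zero-shot open set.
            buckets["zero"].add(label)
        elif n >= many_min:
            buckets["many"].add(label)
        elif n <= few_max:
            buckets["few"].add(label)
        else:
            buckets["medium"].add(label)
    return buckets


# Toy data: three training classes of different sizes and one unseen class.
train = ["MIT"] * 30 + ["ETH Zurich"] * 10 + ["Tiny College"] * 2
test = ["MIT", "ETH Zurich", "Tiny College", "Unseen Institute"]
buckets = bucket_classes(train, test)
```

With these assumed thresholds, each toy class falls into a different bucket, and the class absent from training falls into the zero-shot set. Reporting accuracy per bucket rather than overall keeps the frequent classes from masking performance on the tail.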
Variants: Lot-insts
This dataset is used in 2 benchmarks:
Task | Model | Paper | Date |
---|---|---|---|
Text Classification | Character-BERT+RS | Text Classification in the Wild: … | 2023-02-19 |
Text Classification | CD-V1 | Text Classification in the Wild: … | 2023-02-19 |
Text Classification | sCool | Text Classification in the Wild: … | 2023-02-19 |
Text Classification | FastText | Text Classification in the Wild: … | 2023-02-19 |
Text Classification | Naive Bayes | Text Classification in the Wild: … | 2023-02-19 |
Long-tail Learning | Character-BERT+RS | Text Classification in the Wild: … | 2023-02-19 |
Recent papers with results on this dataset: