Real-world Annotated Few-shot Tasks
The RAFT benchmark (Real-world Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment.
RAFT is a few-shot classification benchmark that tests language models.
Description from: https://raft.elicit.org/
Variants: RAFT
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Few-Shot Text Classification | T-Few | Few-Shot Parameter-Efficient Fine-Tuning is Better … | 2022-05-11 |
Few-Shot Text Classification | Human (crowdsourced) | RAFT: A Real-World Few-Shot Text … | 2021-09-28 |
Few-Shot Text Classification | GPT-3 | RAFT: A Real-World Few-Shot Text … | 2021-09-28 |
Few-Shot Text Classification | AdaBoost | RAFT: A Real-World Few-Shot Text … | 2021-09-28 |
Few-Shot Text Classification | GPT-Neo | RAFT: A Real-World Few-Shot Text … | 2021-09-28 |
Few-Shot Text Classification | GPT-2 | RAFT: A Real-World Few-Shot Text … | 2021-09-28 |
Few-Shot Text Classification | BART MNLI zero-shot | RAFT: A Real-World Few-Shot Text … | 2021-09-28 |
Few-Shot Text Classification | Plurality-class | RAFT: A Real-World Few-Shot Text … | 2021-09-28 |
Few-Shot Text Classification | GPT-3 zero-shot | RAFT: A Real-World Few-Shot Text … | 2021-09-28 |
Recent papers with results on this dataset: