Krapivin

Name: Krapivin
Published: 2008-05-01
License: Unknown

Overview

A dataset for benchmarking keyphrase extraction and generation techniques on long English-language scientific documents. It is a high-quality collection of 2,000 scientific papers from the Computer Science domain, published by ACM. Each paper's keyphrases were assigned by its authors and verified by the reviewers. The parts of each paper, such as the title and abstract, are kept separate, so extraction can be restricted to a chosen part of the text. The content of each paper was converted from PDF to plain text, and formulae, tables, figures, and LaTeX markup were removed automatically. Link: https://huggingface.co/datasets/midas/krapivin

Variants: Krapivin

Associated Benchmarks

This dataset is used in 1 benchmark:

Keyphrase Extraction - Metrics: F1@10
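
For readers unfamiliar with the metric, F1@10 compares a model's ten highest-ranked keyphrases against the gold keyphrases of a paper. The sketch below is illustrative only and is not the benchmark's official scorer. The function name f1_at_k is hypothetical, matching is assumed to be exact after lowercasing and whitespace normalization (published evaluations often also apply stemming), and precision is assumed to be divided by the number of phrases actually returned rather than by k.

```python
def f1_at_k(predicted, gold, k=10):
    """Return (precision, recall, f1) for the top-k predicted keyphrases.

    Assumption: exact match after lowercasing and collapsing whitespace.
    Real evaluations frequently stem both sides first; that step is omitted.
    """
    def normalize(phrase):
        return " ".join(phrase.lower().split())

    # Deduplicate predictions in rank order, then keep at most k of them.
    seen = set()
    top_k = []
    for phrase in predicted:
        norm = normalize(phrase)
        if norm and norm not in seen:
            seen.add(norm)
            top_k.append(norm)
        if len(top_k) == k:
            break

    gold_set = {normalize(p) for p in gold}
    gold_set.discard("")

    if not top_k or not gold_set:
        return 0.0, 0.0, 0.0

    correct = sum(1 for p in top_k if p in gold_set)
    if correct == 0:
        return 0.0, 0.0, 0.0

    # Assumption: precision uses the number of returned phrases, not k.
    precision = correct / len(top_k)
    recall = correct / len(gold_set)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up phrases for illustration; not taken from the dataset.
    predicted = ["Keyphrase Extraction", "neural networks",
                 "keyphrase extraction", "graph ranking"]
    gold = ["keyphrase extraction", "graph ranking", "scientific papers"]
    print(f1_at_k(predicted, gold))
```

A corpus-level score is usually obtained by averaging the per-document F1 values, although some papers average precision and recall first and then combine them, so reported numbers are only comparable when the same convention is used.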

Recent Benchmark Submissions

Task	Model	Paper	Date
Keyphrase Extraction	Attention-Seeker	Attention-Seeker: Dynamic Self-Attention Scoring for …	2024-09-17
Keyphrase Extraction	PromptRank	PromptRank: Unsupervised Keyphrase Extraction Using …	2023-05-08
