NUS

Name: NUS
Published: 2007-12-08
License: Unknown

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

The dataset was constructed in two stages: finding suitable publications, then collecting keyphrases from manual annotators. The Google SOAP API was used to find documents with variants of the query “keywords general terms filetype:pdf”, and over 250 of the resulting PDF documents were downloaded for further processing. The set was then manually restricted to scientific conference papers between 4 and 12 pages long. The PDFs were converted to plain text with the PDF995 software suite, which handled two-column text better than the other programs tried. Of these, 211 documents that converted without problems were selected. The authors then recruited student volunteers from their department to assign keyphrases manually. Each volunteer was given three PDF files, with the author-assigned keyphrases hidden, and asked to assign keyphrases to them.

Variants: NUS

Associated Benchmarks

This dataset is used in 1 benchmark:

Keyphrase Extraction - Metrics: F1@10
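
The F1@10 metric named above can be illustrated with a minimal sketch. The page does not specify how the benchmark normalizes phrases or handles ties, so the details here are assumptions: phrases are matched exactly after lowercasing and whitespace collapsing (many evaluations also stem), and precision is divided by the number of predictions actually returned rather than by a fixed k. The function name `f1_at_k` is hypothetical.

```python
def f1_at_k(predicted, gold, k=10):
    """F1 of the top-k predicted keyphrases against a gold keyphrase set.

    Assumption: exact match after lowercasing and whitespace collapsing.
    Published evaluations often add stemming, which is omitted here.
    """
    def norm(phrase):
        return " ".join(phrase.lower().split())

    gold_set = {norm(g) for g in gold}

    # Keep the first k distinct predictions, preserving rank order.
    top_k, seen = [], set()
    for phrase in predicted:
        n = norm(phrase)
        if n not in seen:
            seen.add(n)
            top_k.append(n)
        if len(top_k) == k:
            break

    if not top_k or not gold_set:
        return 0.0

    correct = sum(1 for p in top_k if p in gold_set)
    if correct == 0:
        return 0.0

    precision = correct / len(top_k)   # assumption: divide by predictions returned
    recall = correct / len(gold_set)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up phrases for illustration only, not taken from the dataset.
    predicted = ["keyphrase extraction", "Neural Networks", "graph ranking"]
    gold = ["keyphrase extraction", "neural networks", "prompt", "language model"]
    print(f1_at_k(predicted, gold))
```

A benchmark score would typically average this per-document value over all documents in the dataset; whether that macro-average is what this benchmark reports is not stated on this page.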

Recent Benchmark Submissions

Task	Model	Paper	Date
Keyphrase Extraction	PromptRank	PromptRank: Unsupervised Keyphrase Extraction Using …	2023-05-08

Research Papers

Recent papers with results on this dataset:

PromptRank: Unsupervised Keyphrase Extraction Using Prompt (2023)
