SWDE

Structured Web Data Extraction

Dataset Information
License: Unknown

Overview

This dataset is a collection of real-world web pages used for research on the automatic extraction of structured data (e.g., attribute-value pairs of entities) from the Web. We hope it can serve as a useful benchmark for evaluating and comparing methods for structured web data extraction.
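
To make "attribute-value pairs of entities" concrete, here is a minimal sketch of the task on a single page. The HTML snippet, the class names, and the attribute names are all hypothetical and are not taken from the dataset. The hand-written class-to-attribute mapping stands in for whatever an extraction method would learn or infer, so this illustrates the input and output shape only, not any method evaluated on the benchmark.

```python
from html.parser import HTMLParser

# Hypothetical page snippet; markup and values are illustrative only.
PAGE = """
<html><body>
  <h1 class="title">Example Book</h1>
  <span class="author">Jane Doe</span>
  <span class="isbn">0000000000</span>
</body></html>
"""

# Hypothetical hand-written rule set: CSS class -> attribute name.
CLASS_TO_ATTRIBUTE = {"title": "title", "author": "author", "isbn": "isbn"}


class AttributeExtractor(HTMLParser):
    """Collects the text of elements whose class maps to a known attribute."""

    def __init__(self, class_to_attribute):
        super().__init__()
        self.class_to_attribute = class_to_attribute
        self.current = None  # attribute name for the element being read
        self.pairs = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        self.current = self.class_to_attribute.get(css_class)

    def handle_data(self, data):
        text = data.strip()
        if self.current and text:
            self.pairs[self.current] = text
            self.current = None

    def handle_endtag(self, tag):
        self.current = None


def extract_pairs(html, class_to_attribute):
    """Return a dict of attribute-value pairs found in one page."""
    parser = AttributeExtractor(class_to_attribute)
    parser.feed(html)
    return parser.pairs


if __name__ == "__main__":
    print(extract_pairs(PAGE, CLASS_TO_ATTRIBUTE))
```

A fixed mapping like this only works for pages that share one template. It is used here purely to show what the extracted structure looks like.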

Source: Official

Variants: SWDE

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Attribute Extraction | InstrucTE (zero-shot) | Schema-Driven Information Extraction from Heterogeneous … | 2023-05-23
Attribute Extraction | DOM-LM | DOM-LM: Learning Generalizable Representations for … | 2022-01-25

Research Papers

Recent papers with results on this dataset: