The Abt-Buy dataset for entity resolution is derived from the online retailers Abt.com and Buy.com. It contains 1081 entities from Abt.com and 1092 entities from Buy.com, together with a gold standard (perfect mapping) of 1097 matching record pairs between the two data sources. The two sources share three attributes: product name, product description, and product price.
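As a rough illustration of this structure, the sketch below models a product record with the three shared attributes and represents the gold standard as a set of identifier pairs. The class name `ProductRecord`, the `record_id` field, the helper `is_match`, and all sample values are hypothetical placeholders, not taken from the dataset files.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class ProductRecord:
    """One product entry; field names are assumptions for illustration."""
    record_id: str                      # hypothetical identifier
    name: str                           # product name
    description: Optional[str] = None   # product description (may be absent)
    price: Optional[str] = None         # product price, kept as raw text


# Placeholder records, invented for this sketch (not real dataset rows).
abt_record = ProductRecord("abt-1", "Example TV 40in", "An example description", "499.00")
buy_record = ProductRecord("buy-7", "Example 40-inch TV")

# The gold standard modeled as a set of (abt_id, buy_id) pairs.
gold_pairs: Set[Tuple[str, str]] = {(abt_record.record_id, buy_record.record_id)}


def is_match(abt: ProductRecord, buy: ProductRecord, gold: Set[Tuple[str, str]]) -> bool:
    """Return True if the (abt, buy) pair appears in the gold mapping."""
    return (abt.record_id, buy.record_id) in gold
```

Storing the mapping as a set of ordered pairs keeps lookups constant-time and makes the direction (Abt first, Buy second) explicit.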
The dataset was initially published in the repository of the Database Group of the University of Leipzig:
https://dbs.uni-leipzig.de/research/projects/object_matching/benchmark_datasets_for_entity_resolution
To make results reproducible and the performance of different matchers comparable on the Abt-Buy matching task, the dataset was split into fixed train, validation, and test sets.
The fixed splits are provided in the CompERBench repository:
http://data.dws.informatik.uni-mannheim.de/benchmarkmatchingtasks/index.html
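To show how a fixed test set makes matchers comparable, here is a minimal sketch that scores a set of predicted matching pairs against gold pairs. Using precision, recall, and F1 as the metrics is an assumption of this sketch, as are the function name `pair_scores` and the toy identifiers; the source does not specify a scoring procedure.

```python
from typing import Iterable, Tuple

Pair = Tuple[str, str]


def pair_scores(predicted: Iterable[Pair], gold: Iterable[Pair]) -> Tuple[float, float, float]:
    """Compute (precision, recall, F1) of predicted pairs against gold pairs."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy identifiers, invented for illustration only.
gold = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}
predicted = {("a1", "b1"), ("a2", "b9")}

precision, recall, f1 = pair_scores(predicted, gold)
```

Because every matcher is scored on the same held-out pairs, differences in these numbers reflect the matchers rather than differences in how the data was partitioned.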
Variants: Abt-Buy
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Entity Resolution | Meta-Llama-3.1-8B-Instruct_fine_tuned | Fine-tuning Large Language Models for … | 2024-09-12 |
Entity Resolution | Meta-Llama-3.1-8B-Instruct | Fine-tuning Large Language Models for … | 2024-09-12 |
Entity Resolution | gpt-4o-mini-2024-07-18_fine_tuned | Fine-tuning Large Language Models for … | 2024-09-12 |
Entity Resolution | gpt-4o-2024-08-06 | Fine-tuning Large Language Models for … | 2024-09-12 |
Entity Resolution | Meta-Llama-3.1-70B-Instruct | Fine-tuning Large Language Models for … | 2024-09-12 |
Entity Resolution | gpt-4o-mini-2024-07-18 | Fine-tuning Large Language Models for … | 2024-09-12 |
Entity Resolution | gpt4-0613_zeroshot | Entity Matching using Large Language … | 2023-10-17 |
Entity Resolution | RoBERTa-SupCon | Supervised Contrastive Learning for Product … | 2022-02-04 |
Entity Resolution | Ditto | Deep Entity Matching with Pre-Trained … | 2020-04-01 |
Recent papers with results on this dataset: