NCT-CRC-HE-100K

Dataset Information
Modalities: Images
Introduced: 2018
License: Unknown

Overview

The NCT-CRC-HE-100K dataset is a set of 100,000 non-overlapping image patches extracted from 86 hematoxylin and eosin (H&E) stained human cancer tissue slides and normal tissue from the NCT biobank (National Center for Tumor Diseases) and the UMM pathology archive (University Medical Center Mannheim). A companion dataset, Colorectal Cancer-Validation-Histology-7K (CRC-VAL-HE-7K), consists of 7,180 image patches extracted from 50 patients with colorectal adenocarcinoma; these patients do not overlap with the patients in the NCT-CRC-HE-100K dataset. The data were created by pathologists who manually delineated tissue regions in whole slide images into the following nine tissue classes: adipose (ADI), background (BACK), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR), and colorectal adenocarcinoma epithelium (TUM).
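The nine class abbreviations above can be used directly as labels. The sketch below maps each abbreviation to its tissue name and counts patches per class in a local copy of the data. It assumes the patches are stored with one subfolder per class abbreviation (for example `ADI/`, `TUM/`); that layout, the file extensions, and the helper name `count_patches` are illustrative assumptions, not something this page specifies.

```python
from collections import Counter
from pathlib import Path

# The nine tissue classes listed in the overview, keyed by abbreviation.
TISSUE_CLASSES = {
    "ADI": "adipose",
    "BACK": "background",
    "DEB": "debris",
    "LYM": "lymphocytes",
    "MUC": "mucus",
    "MUS": "smooth muscle",
    "NORM": "normal colon mucosa",
    "STR": "cancer-associated stroma",
    "TUM": "colorectal adenocarcinoma epithelium",
}


def count_patches(root, extensions=(".tif", ".tiff", ".png", ".jpg")):
    """Count image patches per class under `root`.

    Assumption (hypothetical layout): each patch sits in a folder named
    after its class abbreviation, e.g. root/ADI/some_patch.tif.
    Files in folders with other names are ignored.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            label = path.parent.name
            if label in TISSUE_CLASSES:
                counts[label] += 1
    return counts
```

Calling `count_patches` on the training folder and again on the CRC-VAL-HE-7K folder gives a quick check of the class balance in each set before training a classifier.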

Image source: https://www.cs.unc.edu/~mn/sites/default/files/macenko2009.pdf

Variants: NCT-CRC-HE-100K

Associated Benchmarks

This dataset is used in 2 benchmarks:

Recent Benchmark Submissions

| Task | Model | Paper | Date |
|---|---|---|---|
| Image Classification | SAG-ViT | SAG-ViT: A Scale-Aware, High-Fidelity Patching … | 2024-11-14 |
| Medical Image Classification | ResNeXt-50-32x4d | ResNet strikes back: An improved … | 2021-10-01 |
| Medical Image Classification | RegNetY-3.2GF | RegNet: Self-Regulated Network for Image … | 2021-01-03 |
| Medical Image Classification | Efficientnet-b0 | EfficientNet: Rethinking Model Scaling for … | 2019-05-28 |
| Medical Image Classification | Res2Net-50 | Res2Net: A New Multi-scale Backbone … | 2019-04-02 |
| Medical Image Classification | DenseNet-169 | Densely Connected Convolutional Networks | 2016-08-25 |
| Medical Image Classification | ResNet-18 | Deep Residual Learning for Image … | 2015-12-10 |
| Medical Image Classification | ResNet-50 | Deep Residual Learning for Image … | 2015-12-10 |

Research Papers

Recent papers with results on this dataset: