MHIST

Minimalist Histopathology image analysis dataset

Dataset Information
Modalities: Images, Biology
Introduced: 2021
License: Unknown

Overview

The minimalist histopathology image analysis dataset (MHIST) is a binary classification dataset of 3,152 fixed-size images of colorectal polyps, each with a gold-standard label determined by the majority vote of seven board-certified gastrointestinal pathologists. MHIST also includes each image’s annotator agreement level. As a minimalist dataset, MHIST occupies less than 400 MB of disk space, and a ResNet-18 baseline can be trained to convergence on MHIST in just 6 minutes using approximately 3.5 GB of memory on an NVIDIA RTX 3090. As example use cases, the authors use MHIST to study natural questions that arise in histopathology image classification, such as how dataset size, network depth, transfer learning, and high-disagreement examples affect model performance.

Source: Wei et al.
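
The labeling scheme described above can be illustrated with a minimal sketch. It assumes each image comes with seven binary votes and treats the agreement level as the number of annotators who chose the winning label; that definition, the placeholder class names "HP" and "SSA", the helper names, and the threshold of 4 are all assumptions made for illustration, not details taken from the dataset files.

```python
from collections import Counter

NUM_ANNOTATORS = 7  # from the overview: seven pathologists label each image


def majority_vote(votes):
    """Return (label, agreement) for one image's binary votes.

    agreement is the number of annotators who chose the winning label,
    which is an assumed definition of the per-image agreement level.
    """
    if len(votes) != NUM_ANNOTATORS:
        raise ValueError(f"expected {NUM_ANNOTATORS} votes, got {len(votes)}")
    counts = Counter(votes)
    if len(counts) > 2:
        raise ValueError("expected at most two distinct labels")
    # With an odd number of binary votes a tie is impossible.
    label, agreement = counts.most_common(1)[0]
    return label, agreement


def is_high_disagreement(agreement, threshold=4):
    """Flag images whose majority had `threshold` votes or fewer (assumed cutoff)."""
    return agreement <= threshold


# Hypothetical votes for a single image; "HP" and "SSA" are placeholder labels.
votes = ["SSA", "SSA", "HP", "SSA", "HP", "SSA", "HP"]
label, agreement = majority_vote(votes)
print(label, f"{agreement}/{NUM_ANNOTATORS}", is_high_disagreement(agreement))
# prints: SSA 4/7 True
```

Because the number of annotators is odd and the task is binary, every image receives an unambiguous label, and the agreement count can be used to separate narrow 4/7 decisions from unanimous ones when studying high-disagreement examples.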

Image source: Wei et al.

Variants: MHIST

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task            Model                    Paper                                                Date
Classification  MoCo-v2 (ResNet-50)      Benchmarking Self-Supervised Learning on Diverse …   2022-12-09
Classification  Supervised (ViT-S/16)    Benchmarking Self-Supervised Learning on Diverse …   2022-12-09
Classification  Barlow Twins (ResNet-50) Benchmarking Self-Supervised Learning on Diverse …   2022-12-09
Classification  DINO (ViT-S/16)          Benchmarking Self-Supervised Learning on Diverse …   2022-12-09
Classification  Supervised (ResNet-50)   Benchmarking Self-Supervised Learning on Diverse …   2022-12-09
Classification  SwAV (ResNet-50)         Benchmarking Self-Supervised Learning on Diverse …   2022-12-09

Research Papers

Recent papers with results on this dataset: