Machine Learning Benchmarks

Browse 343 benchmarks across 67 tasks
Browse by Category

1 Image, 2*2 Stitchi

FQL-Driving

FQL-driving

📊 1 results
📏 Metrics: 0..5sec

10-shot image generation

FQL-Driving

FQL-driving

📊 1 results
📏 Metrics: 0-shot MRR

FlyingThings3D

FlyingThings3D is a synthetic dataset for optical flow, disparity and scene flow estimation. It consists of everyday objects flying along …

📊 1 results
📏 Metrics: 0..5sec

MEAD

Multi-view Emotional Audio-visual Dataset

📊 1 results
📏 Metrics: 12k

Music21

Music21 is an untrimmed video dataset crawled by keyword query from Youtube. It contains music performances belonging to 21 categories. …

📊 1 results
📏 Metrics: 0..5sec

16k

ConceptNet

ConceptNet is a knowledge graph that connects words and phrases of natural language with labeled edges. Its knowledge is collected …

📊 1 results
📏 Metrics: 1'"

3D Classification

U-10: United-10 COVID19 CT Dataset

This dataset supports the research detailed in the pre-print "Virtual Imaging Trials Improved the Transparency and Reliability of AI Systems …

📊 2 results
📏 Metrics: AUC

3D Face Modelling

Voxceleb-3D

A dataset for voice and 3D face structure study. It contains about 1.4K identities with their 3D face models and …

📊 2 results
📏 Metrics: Mean ARE, ARE-ER, ARE-FR, ARE-MR, ARE-CR

Anxiety Detection

Well-being Dataset

The dataset is a private dataset collected for automatic analysis of psychological distress. It contains self-reported distress labels provided by …

📊 1 results
📏 Metrics: F1-score

Automatic Sleep Stage Classification

ISRUC-Sleep

ISRUC-Sleep is a polysomnographic (PSG) dataset. The data were obtained from human adults, including healthy subjects, and subjects with sleep …

📊 1 results
📏 Metrics: AUROC, Accuracy, Kappa

Sleep-EDF

The sleep-edf database contains 197 whole-night PolySomnoGraphic sleep recordings, containing EEG, EOG, chin EMG, and event markers. Some records also …

📊 3 results
📏 Metrics: Accuracy, Cohen’s Kappa score, Number of parameters (M)

Blood pressure estimation

MIMIC-III

The Medical Information Mart for Intensive Care III (MIMIC-III) dataset is a large, de-identified and publicly-available collection of medical records. …

📊 1 results
📏 Metrics: MAE for SBP [mmHg], MAE for DBP [mmHg], Mean Squared Error, MAE

Brain Decoding

BCI Competition IV: ECoG to Finger Movements

Prediction of Finger Flexion. IV Brain-Computer Interface Data Competition. The goal of this dataset is to predict the flexion of …

📊 1 results
📏 Metrics: Pearson Correlation

Stanford ECoG library: ECoG to Finger Movements

Electrophysiological data from implanted electrodes in the human brain are rare, and therefore scientific access to it has remained somewhat …

📊 1 results
📏 Metrics: Pearson Correlation

Breast Tumour Classification

PCam

PatchCamelyon is an image classification dataset. It consists of 327,680 color images (96 x 96px) extracted from histopathologic scans of …

📊 16 results
📏 Metrics: AUC, Accuracy

COVID-19 Diagnosis

COVIDGR

Under a close collaboration with an expert radiologist team of the Hospital Universitario San Cecilio, the COVIDGR-1.0 dataset of patients' …

📊 1 results
📏 Metrics: Accuracy

COVIDx CXR-3

COVIDx CXR-3 is an open access benchmark dataset that we generated, comprising 30,882 CXR images across 17,026 patient cases. Images …

📊 7 results
📏 Metrics: Per-Class Accuracy

Large COVID-19 CT scan slice dataset

"We built a large lung CT scan dataset for COVID-19 by curating data from 7 public datasets listed in the …

📊 1 results
📏 Metrics: AUC-ROC, Accuracy, Macro F1, Macro Precision, Macro Recall, Micro Precision, Specificity

Circulatory Failure

HiRID

HiRID is a freely accessible critical care dataset containing data relating to almost 34 thousand patient admissions to the Department …

📊 8 results
📏 Metrics: AUPRC, Recall@50

Classification

Adult

Data Set Information: Extraction was done by Barry Becker from the 1994 Census database. A set of reasonably clean records …

📊 1 results
📏 Metrics: AUROC

BIOSCAN_1M_Insect Dataset

In an effort to catalog insect biodiversity, we propose a new large dataset of hand-labelled insect images, the BIOSCAN-1M Insect …

📊 2 results
📏 Metrics: Macro F1

BiasBios

The purpose of this dataset was to study gender bias in occupations. Online biographies, written in English, were collected to …

📊 1 results
📏 Metrics: 1:1 Accuracy

BoolQ

BoolQ is a question answering dataset for yes/no questions containing 15942 examples. These questions are naturally occurring – they are …

📊 2 results
📏 Metrics: Test Accuracy

Brain Tumor MRI Dataset

This dataset is a combination of the following three datasets: figshare, SARTAJ dataset, and Br35H. This dataset contains 7022 …

📊 1 results
📏 Metrics: F1 score

CIFAKE: Real and AI-Generated Synthetic Images

The quality of AI-generated images has rapidly increased, leading to concerns of authenticity and trustworthiness. CIFAKE is a dataset that …

📊 1 results
📏 Metrics: Validation Accuracy

CIFAR-100

The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists …

📊 1 results
📏 Metrics: Accuracy

CIFAR-10C

Common corruptions dataset for CIFAR10

📊 1 results
📏 Metrics: Accuracy on Brightness Corrupted Images

COVID-19 Image Data Collection

Contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it …

📊 1 results
📏 Metrics: Accuracy

CWRU Bearing Dataset

Data was collected for normal bearings, single-point drive end and fan end defects. Data was collected at 12,000 samples/second and …

📊 1 results
📏 Metrics: 10 fold Cross validation

Chest X-Ray Images (Pneumonia)

The normal chest X-ray (left panel) depicts clear lungs without any areas of abnormal opacification in the image. Bacterial pneumonia …

📊 1 results
📏 Metrics: Accuracy

ForgeryNet

We construct the ForgeryNet dataset, an extremely large face forgery dataset with unified annotations in image- and video-level data across …

📊 3 results
📏 Metrics: AUC, Accuracy

Full-body Parkinson’s disease dataset

A public data set of walking full-body kinematics and kinetics in individuals with Parkinson’s disease

📊 7 results
📏 Metrics: F1-score (weighted)

HOWS

HOWS-CL-25 (Household Objects Within Simulation dataset for Continual Learning) is a synthetic dataset especially designed for object classification on mobile …

📊 1 results
📏 Metrics: Overall accuracy after last sequence

HRF

The HRF dataset is a dataset for retinal vessel segmentation which comprises 45 images and is organized as 15 subsets. …

📊 1 results
📏 Metrics: Accuracy

IRFL: Image Recognition of Figurative Language

The IRFL dataset consists of idioms, similes, and metaphors with matching figurative and literal images, as well as two novel …

📊 1 results
📏 Metrics: 1-of-100 Accuracy

ISIC 2019

The goal for ISIC 2019 is to classify dermoscopic images among nine different diagnostic categories. 25,331 images are available for training across …

📊 1 results
📏 Metrics: Balanced Multi-Class Accuracy

ImageNet C-OOD (class-out-of-distribution)

This dataset was presented as part of the ICLR 2023 paper A framework for benchmarking Class-out-of-distribution detection and its application …

📊 5 results
📏 Metrics: Detection AUROC (severity 0), Detection AUROC (severity 5), Detection AUROC (severity 10)

InDL

Dataset Introduction In this work, we introduce the In-Diagram Logic (InDL) dataset, an innovative resource crafted to rigorously evaluate the …

📊 9 results
📏 Metrics: Average Recall

LES-AV

This data set comprises 22 fundus images with their corresponding manual annotations for the blood vessels, separated as arteries and …

📊 1 results
📏 Metrics: Accuracy

Liver-US

The Liver-US dataset is a comprehensive collection of high-quality ultrasound images of the liver, including both normal and abnormal cases. …

📊 1 results
📏 Metrics: AUC

MHIST

The minimalist histopathology image analysis dataset (MHIST) is a binary classification dataset of 3,152 fixed-size images of colorectal polyps, each …

📊 6 results
📏 Metrics: Accuracy

MedSecId

The process by which sections in a document are demarcated and labeled is known as section identification. Such sections are …

📊 1 results
📏 Metrics: 1 shot Micro-F1

MixedWM38

MixedWM38 Dataset (WaferMap) has more than 38000 wafer maps, including 1 normal pattern, 8 single defect patterns, and 29 mixed defect …

📊 1 results
📏 Metrics: Accuracy, MCC

MuReD Dataset

Early detection of retinal diseases is one of the most important means of preventing partial or permanent blindness in patients. …

📊 1 results
📏 Metrics: ML F1, ML mAP, ML AUC

N-CARS

A large real-world event-based dataset for object classification. Source: HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification

📊 6 results
📏 Metrics: Accuracy (%), Architecture, Representation, Representation Time (ms / 100ms events), Inference Time, Params (M)

N-ImageNet

The N-ImageNet dataset is an event-camera counterpart for the ImageNet dataset. The dataset is obtained by moving an event camera …

📊 9 results
📏 Metrics: Accuracy (%)

RITE

The RITE (Retinal Images vessel Tree Extraction) is a database that enables comparative studies on segmentation or classification of arteries …

📊 1 results
📏 Metrics: Accuracy

RSSCN7

The RSSCN7 dataset contains satellite images acquired from Google Earth, which were originally collected for remote sensing scene classification. We …

📊 1 results
📏 Metrics: 1:1 Accuracy

RTE

The Recognizing Textual Entailment (RTE) datasets come from a series of textual entailment challenges. Data from RTE1, RTE2, RTE3 and …

📊 2 results
📏 Metrics: Test Accuracy

SGD

The Schema-Guided Dialogue (SGD) dataset consists of over 20k annotated multi-domain, task-oriented conversations between a human and a virtual assistant. …

📊 1 results
📏 Metrics: F1 (Seqeval)

SHD - Adding

This dataset is based on the Spiking Heidelberg Digits (SHD) dataset. Sample inputs consist of two spike encoded digits sampled …

📊 3 results
📏 Metrics: Accuracy (%)

SPOTS-10

The SPOTS-10 dataset is an extensive collection of grayscale images showcasing diverse patterns found in ten animal species. Specifically, SPOTS-10 …

📊 9 results
📏 Metrics: Accuracy

SST-2

The Stanford Sentiment Treebank is a corpus with fully labeled parse trees that allows for a complete analysis of the …

📊 2 results
📏 Metrics: Test Accuracy

Sentiment140

Sentiment140 is a dataset that allows you to discover the sentiment of a brand, product, or topic on Twitter. Source: …

📊 1 results
📏 Metrics: Accuracy

SimGas

This dataset consists of computer-generated images for gas leakage segmentation. It features diverse backgrounds, interfering foreground objects, and precise ground …

📊 1 results
📏 Metrics: Frame Level Accuracy

Sound-based drone fault classification using multitask learning

arXiv: https://arxiv.org/abs/2304.11708 (accepted at the 29th International Congress on Sound and Vibration, ICSV29). The drone has been used for various …

📊 1 results
📏 Metrics: macro f1 score (A(100), B(100), C(100) Avg.)

TACM12K

Table-ACM12K (TACM12K) is a relational table dataset derived from the ACM heterogeneous graph dataset. It includes four tables: papers, authors, …

📊 1 results
📏 Metrics: Accuracy

TCGA

📊 1 results
📏 Metrics: AUPRC, AUROC

TLF2K

Table-LastFm2K (TLF2K) is a relational table dataset derived from the classical LastFM2K dataset. It contains three tables: artists, user_artists, and …

📊 1 results
📏 Metrics: Accuracy

TML1M

Table-MovieLens1M (TML1M) is a relational table dataset derived from the classical MovieLens1M dataset. It consists of three tables: users, movies, …

📊 1 results
📏 Metrics: Accuracy

WSC

The Winograd Schema Challenge was introduced both as an alternative to the Turing Test and as a test of a …

📊 2 results
📏 Metrics: Test Accuracy

WiC

WiC is a benchmark for the evaluation of context-sensitive word embeddings. WiC is framed as a binary classification task. Each …

📊 2 results
📏 Metrics: Test Accuracy

XImageNet-12

Enlarge the dataset to understand how image backgrounds affect computer vision ML models, with the following topics: Blur Background …

📊 3 results
📏 Metrics: Robustness Score

Clinical Concept Extraction

2010 i2b2/VA

2010 i2b2/VA is a biomedical dataset for relation classification and entity typing.

📊 3 results
📏 Metrics: Exact Span F1

Colorectal Gland Segmentation

STARE

The STARE (Structured Analysis of the Retina) dataset is a dataset for retinal vessel segmentation. It contains 20 equal-sized (700×605) …

📊 3 results
📏 Metrics: AUC

Core Promoter Detection

GUE

A collection of 28 datasets across 7 tasks constructed for genome language model evaluation. Contains seven tasks: promoter prediction, core …

📊 1 results
📏 Metrics: MCC

Covid Variant Prediction

GUE

A collection of 28 datasets across 7 tasks constructed for genome language model evaluation. Contains seven tasks: promoter prediction, core …

📊 1 results
📏 Metrics: Avg F1

Diabetes Prediction

Diabetes

What do the instances in this dataset represent? The instances represent hospitalized patient records diagnosed with diabetes. Are there recommended …

📊 1 results
📏 Metrics: Accuracy

Document Text Classification

Tobacco-3482

The Tobacco-3482 dataset consists of document images belonging to 10 classes such as letter, form, email, resume, memo, etc. The …

📊 3 results
📏 Metrics: Accuracy, Training time (hours)

Drug Discovery

DAVIS-DTA

Dataset Description: The interaction of 72 kinase inhibitors with 442 kinases covering >80% of the human catalytic protein kinome. Task …

📊 3 results
📏 Metrics: CI, MSE

KIBA

Dataset Description: Toward making use of the complementary information captured by the various bioactivity types, including IC50, K(i), and K(d), …

📊 3 results
📏 Metrics: CI, MSE

LIT-PCBA(ALDH1)

Comparative evaluation of virtual screening methods requires a rigorous benchmarking procedure on diverse, realistic, and unbiased data sets. Recent investigations …

📊 1 results
📏 Metrics: AUC

LIT-PCBA(KAT2A)

Comparative evaluation of virtual screening methods requires a rigorous benchmarking procedure on diverse, realistic, and unbiased data sets. Recent investigations …

📊 1 results
📏 Metrics: AUC

LIT-PCBA(MAPK1)

Comparative evaluation of virtual screening methods requires a rigorous benchmarking procedure on diverse, realistic, and unbiased data sets. Recent investigations …

📊 1 results
📏 Metrics: AUC

MUV

The Maximum Unbiased Validation (MUV) dataset is a benchmark dataset selected from PubChem BioAssay. It was created by applying a …

📊 4 results
📏 Metrics: AUC

PCBA

PCBA dataset 11 is a collection of high-quality dose-response data, formulated as a multitask learning benchmark from 128 high-throughput screening …

📊 2 results
📏 Metrics: AUC

QED

QED is a linguistically principled framework for explanations in question answering. Given a question and a passage, QED represents an …

📊 1 results
📏 Metrics: Diversity, Success

QM9

QM9 provides quantum chemical properties (at DFT level) for a relevant, consistent, and comprehensive chemical space of small organic molecules. …

📊 10 results
📏 Metrics: Error ratio

SIDER

SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and …

📊 4 results
📏 Metrics: AUC

Tox21

The Tox21 data set comprises 12,060 training samples and 647 test samples that represent chemical compounds. There are 801 "dense …

📊 10 results
📏 Metrics: AUC

ClinTox

The ClinTox dataset compares drugs approved by the FDA and drugs that have failed clinical trials for toxicity reasons. The …

📊 3 results
📏 Metrics: AUC

ECG Classification

PTB-XL

Electrocardiography (ECG) is a key diagnostic tool to assess the cardiac condition of a patient. Automatic ECG interpretation algorithms as …

📊 1 results
📏 Metrics: AUROC

UCR Time Series Classification Archive

The UCR Time Series Archive - introduced in 2002, has become an important resource in the time series data mining …

📊 1 results
📏 Metrics: Accuracy (Test)

EEG Decoding

CWL EEG/fMRI Dataset

EEG/fMRI data from 8 subjects doing a simple eyes open/eyes closed task is provided on this webpage. The EEG/fMRI data …

📊 1 results
📏 Metrics: Pearson Correlation

Epigenetic Marks Prediction

GUE

A collection of 28 datasets across 7 tasks constructed for genome language model evaluation. Contains seven tasks: promoter prediction, core …

📊 1 results
📏 Metrics: MCC

Epilepsy Prediction

Epilepsy seizure prediction

The original dataset from the reference consists of 5 different folders, each with 100 files, with each file representing a …

📊 1 results
📏 Metrics: 1:1 Accuracy

Fovea Detection

ADAM

ADAM is organized as a half day Challenge, a Satellite Event of the ISBI 2020 conference in Iowa City, Iowa, …

📊 1 results
📏 Metrics: Euclidean Distance (ED)

IDRiD

Indian Diabetic Retinopathy Image Dataset (IDRiD) dataset consists of typical diabetic retinopathy lesions and normal retinal structures annotated at a …

📊 1 results
📏 Metrics: Euclidean Distance (ED)

Image Denoising

DND

Benchmarking Denoising Algorithms with Real Photographs This dataset consists of 50 pairs of noisy and (nearly) noise-free images captured with …

📊 15 results
📏 Metrics: PSNR (sRGB), SSIM (sRGB)

FFHQ

Flickr-Faces-HQ (FFHQ) consists of 70,000 high-quality PNG images at 1024×1024 resolution and contains considerable variation in terms of age, ethnicity …

📊 1 results
📏 Metrics: LPIPS

FMD

The Fluorescence Microscopy Denoising (FMD) dataset is dedicated to Poisson-Gaussian denoising. The dataset consists of 12,000 real fluorescence microscopy images …

📊 1 results
📏 Metrics: PSNR

Nam

A holistic approach to cross-channel image noise modeling and its application to image denoising

📊 1 results
📏 Metrics: PSNR, SSIM

PolyU

PolyU Dataset is a large dataset of real-world noisy images with reasonably obtained corresponding “ground truth” images. The basic idea …

📊 1 results
📏 Metrics: PSNR, SSIM

SIDD

SIDD is an image denoising dataset containing 30,000 noisy images from 10 scenes under different lighting conditions using five representative …

📊 20 results
📏 Metrics: PSNR (sRGB), SSIM (sRGB), Average PSNR

Image Generation

ARKitScenes

ARKitScenes is an RGB-D dataset captured with the widely available Apple LiDAR scanner. Along with the per-frame raw data (Wide …

📊 4 results
📏 Metrics: FID, FID (SwAV)

Binarized MNIST

A binarized version of MNIST. Source: Binarized MNIST

📊 10 results
📏 Metrics: nats, bits/dimension

CIFAR-10

The CIFAR-10 database (Canadian Institute For Advanced Research database) is a large collection of natural color images. It has a …

📊 72 results
📏 Metrics: FID, IS, NFE

CIFAR-100

The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists …

📊 6 results
📏 Metrics: FID, Inception Score, Model Size (MB)

CLEVR

CLEVR (Compositional Language and Elementary Visual Reasoning) is a synthetic Visual Question Answering dataset. It contains images of 3D-rendered objects; …

📊 6 results
📏 Metrics: FID-5k-training-steps

CelebA

CelebFaces Attributes dataset contains 202,599 face images of the size 178×218 from 10,177 celebrities, each annotated with 40 binary labels …

📊 1 results
📏 Metrics: bpd (8-bits)

CelebA-HQ

The CelebA-HQ dataset is a high-quality version of CelebA that consists of 30,000 images at 1024×1024 resolution. Source: [IntroVAE: Introspective …

📊 1 results
📏 Metrics: FLD

Cityscapes

Cityscapes is a large-scale database which focuses on semantic understanding of urban street scenes. It provides semantic, instance-wise, and dense …

📊 6 results
📏 Metrics: FID-10k-training-steps

FFHQ

Flickr-Faces-HQ (FFHQ) consists of 70,000 high-quality PNG images at 1024×1024 resolution and contains considerable variation in terms of age, ethnicity …

📊 12 results
📏 Metrics: FID, Clean-FID (70k), FID-10k-training-steps

Fashion-MNIST

Fashion-MNIST is a dataset comprising 28×28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per …

📊 5 results
📏 Metrics: FID, Precision, Recall

KMNIST

📊 1 results
📏 Metrics: FID

LLVIP

Visible-infrared Paired Dataset for Low-light Vision: 30976 images (15488 pairs); 24 dark scenes, 2 daytime scenes; …

📊 1 results
📏 Metrics: PSNR, SSIM

LSUN

The Large-scale Scene Understanding (LSUN) challenge aims to provide a different benchmark for large-scale scene classification and understanding. The LSUN …

📊 1 results
📏 Metrics: Average FID

MNIST

The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has …

📊 11 results
📏 Metrics: bits/dimension, FID, Precision, Recall, PSNR, SSIM

MetFaces

MetFaces is an image dataset of human faces extracted from works of art. The dataset consists of 1336 high-quality PNG …

📊 3 results
📏 Metrics: MAE Signature, MAE log-signature, RMSE Signature, RMSE log-signature

Multi-dSprites

📊 1 results
📏 Metrics: FID

NASA Perseverance

Samples from NASA Perseverance and set of GAN generated synthetic images from Neural Mars.

📊 1 results
📏 Metrics: MAE Signature, MAE log-signature, RMSE Signature, RMSE log-signature

ObjectsRoom

The ObjectsRoom dataset is based on the MuJoCo environment used by the Generative Query Network [4] and is a multi-object …

📊 3 results
📏 Metrics: FID

RC-49

RC-49 is a benchmark dataset for generating images conditional on a continuous scalar variable. It is made by rendering 49 …

📊 2 results
📏 Metrics: Intra-FID

Replica

The Replica Dataset is a dataset of high quality reconstructions of a variety of indoor spaces. Each reconstruction has clean …

📊 4 results
📏 Metrics: FID, FID (SwAV)

SDSS Galaxies

This is a dataset of 306,006 galaxies whose coordinates are taken from the Sloan Digital Sky Survey Data Release 7 …

📊 1 results
📏 Metrics: FID

STL-10

The STL-10 is an image dataset derived from ImageNet and popularly used to evaluate algorithms of unsupervised feature learning or …

📊 25 results
📏 Metrics: FID, Inception score, Model Size (MB), Recall, NFE

ShapeStacks

A simulation-based dataset featuring 20,000 stack configurations composed of a variety of elementary geometric primitives richly annotated regarding semantics and …

📊 3 results
📏 Metrics: FID

Stacked MNIST

The Stacked MNIST dataset is derived from the standard MNIST dataset with an increased number of discrete modes. 240,000 RGB …

📊 2 results
📏 Metrics: FID, Inception score

Stanford Cars

The Stanford Cars dataset consists of 196 classes of cars with a total of 16,185 images, taken from the rear. …

📊 4 results
📏 Metrics: FID, Inception score

Stanford Dogs

The Stanford Dogs dataset contains 20,580 images of 120 classes of dogs from around the world, which are divided into …

📊 4 results
📏 Metrics: FID, Inception score

TextAtlasEval

A Dense-text Image Benchmark to evaluate large generation model's ability on text generation.

📊 4 results
📏 Metrics: TextVisionBlend OCR (F1 Score), TextVisionBlend OCR (Accuracy), TextVisionBlend OCR (Cer), TextVisionBlend FID, TextVisionBlend Clip Score, StyledTextSynth OCR (F1 Score), StyledTextSynth OCR (Accuracy), StyledTextSynth OCR (Cer), StyledTextSynth FID, StyledTextSynth Clip Score, TextScenesHQ OCR (F1 Score), TextScenesHQ OCR (Accuracy), TextScenesHQ OCR (Cer), TextScenesHQ FID, TextScenesHQ Clip Score

VLN-CE

Vision and Language Navigation in Continuous Environments (VLN-CE) is an instruction-guided navigation task with crowdsourced instructions, realistic environments, and unconstrained …

📊 4 results
📏 Metrics: FID, FID (SwAV)

VizDoom

ViZDoom is an AI research platform based on the classical First Person Shooter game Doom. The most popular game mode …

📊 4 results
📏 Metrics: FID, FID (SwAV)

WISE

WISE, the first benchmark specifically designed for World Knowledge-Informed Semantic Evaluation. WISE moves beyond simple word-pixel mapping by challenging models …

📊 13 results
📏 Metrics: Overall, Cultural, Time, Space, Biology, Physics, Chemistry

Image Reconstruction

ImageNet

The ImageNet dataset contains 14,197,122 annotated images according to the WordNet hierarchy. Since 2010 the dataset is used in the …

📊 15 results
📏 Metrics: FID, LPIPS, PSNR, SSIM

Spike-X4K

Overview The Spike-X4K Dataset is a high-resolution image reconstruction resource tailored for the latest advancements in spike camera technology. …

📊 1 results
📏 Metrics: Average PSNR

Ultra-High Resolution Image Reconstruction Benchmark

Ultra-high definition benchmark (UHDBench) includes 2293 images at 2k resolution sourced from the ground-truth test sets of HRSOD, LIU4k, UAVid, …

📊 6 results
📏 Metrics: rFID, PSNR, SSIM, LPIPS

Information Extraction

SemTabNet

Dataset Card for SemTabNet. This dataset accompanies the following paper: Title: Statements: Universal Information Extraction from Tables with …

📊 1 results
📏 Metrics: average Tree Similarity Score

Kidney Function

HiRID

HiRID is a freely accessible critical care dataset containing data relating to almost 34 thousand patient admissions to the Department …

📊 6 results
📏 Metrics: MAE

Language Modelling

2000 HUB5 English

2000 HUB5 English Evaluation Transcripts was developed by the Linguistic Data Consortium (LDC) and consists of transcripts of 40 English …

📊 1 results
📏 Metrics: 10-stage average accuracy

Arxiv HEP-TH citation graph

Arxiv HEP-TH (high energy physics theory) citation graph is from the e-print arXiv and covers all the citations within a …

📊 1 results
📏 Metrics: BPB

Books3

The Books3 dataset emerged as part of a broader effort to train AI models for natural language understanding and generation. …

📊 1 results
📏 Metrics: BPB

C4

C4 is a colossal, cleaned version of Common Crawl's web crawl corpus. It was based on Common Crawl dataset: https://commoncrawl.org. …

📊 9 results
📏 Metrics: Perplexity, TPUv3 Hours, Steps

Curation Corpus

The Curation Corpus is a collection of 40,000 professionally-written summaries of news articles, with links to the articles themselves. Source: …

📊 1 results
📏 Metrics: BPB

FreeLaw

Free Law Project is a leading nonprofit organization that aims to make the legal ecosystem more equitable and competitive through …

📊 1 results
📏 Metrics: BPB

Hutter Prize

The Hutter Prize Wikipedia dataset, also known as enwiki8, is a byte-level dataset consisting of the first 100 million bytes …

📊 18 results
📏 Metrics: Bit per Character (BPC), Number of params

LAMBADA

The LAMBADA (LAnguage Modeling Broadened to Account for Discourse Aspects) benchmark is an open-ended cloze task which consists of about …

📊 34 results
📏 Metrics: Accuracy, Perplexity

OpenWebText

OpenWebText is an open-source recreation of the WebText corpus. The text is web content extracted from URLs shared on Reddit …

📊 12 results
📏 Metrics: eval_perplexity, eval_loss, parameters

PhilPapers

PhilPapers is a remarkable resource for the philosophical community. PhilPapers: It's an …

📊 1 results
📏 Metrics: BPB

PubMed Cognitive Control Abstracts

A collection of 385,705 scientific abstracts about Cognitive Control and their GPT-3 embeddings.

📊 1 results
📏 Metrics: BPB

SALMon

The SALMon dataset and benchmark was introduced in the paper "A Suite for Acoustic Language Model Evaluation", with the goal …

📊 8 results
📏 Metrics: Sentiment Consistency, Speaker Consistency, Gender Consistency, Background (Domain) Consistency, Background (Random) Consistency, Room Consistency, Sentiment Alignment, Background Alignment

Text8

📊 22 results
📏 Metrics: Bit per Character (BPC), Number of params

The Pile

The Pile is an 825 GiB diverse, open source language modelling data set that consists of 22 smaller, high-quality datasets …

📊 39 results
📏 Metrics: Bits per byte, Test perplexity

VietMed

We introduced a Vietnamese speech recognition dataset in the medical domain comprising 16h of labeled medical speech, 1000h of unlabeled …

📊 2 results
📏 Metrics: PPL

Wiki-40B

A new multilingual language model benchmark that is composed of 40+ languages spanning several scripts and linguistic families containing round …

📊 3 results
📏 Metrics: Perplexity

WikiText-103

The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good …

📊 83 results
📏 Metrics: Test perplexity, Validation perplexity, Number of params

WikiText-2

The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good …

📊 34 results
📏 Metrics: Test perplexity, Validation perplexity, Number of params

language-modeling-recommendation

This is the Big-Bench version of our language-based movie recommendation dataset (https://github.com/google/BIG-bench/tree/main/bigbench/benchmark_tasks/movie_recommendation). GPT-2 has 48.8% accuracy; chance is 25%.

📊 1 results
📏 Metrics: 1:1 Accuracy

Length-of-Stay prediction

Clinical Admission Notes from MIMIC-III

This dataset is created from MIMIC-III (Medical Information Mart for Intensive Care III) and contains simulated patient admission notes. The …

📊 3 results
📏 Metrics: AUROC

MIMIC-III

The Medical Information Mart for Intensive Care III (MIMIC-III) dataset is a large, de-identified and publicly-available collection of medical records. …

📊 5 results
📏 Metrics: Accuracy (LOS>3 Days), Accuracy (LOS>7 Days)

Lung Nodule Classification

LIDC-IDRI

The LIDC-IDRI dataset contains lesion annotations from four experienced thoracic radiologists. LIDC-IDRI contains 1,018 low-dose lung CTs from 1010 lung …

📊 7 results
📏 Metrics: Accuracy, Acc, AUC, Accuracy (10-fold), Recall/Sensitivity, Precision, F1 Score

Medical Code Prediction

MIMIC-III

The Medical Information Mart for Intensive Care III (MIMIC-III) dataset is a large, de-identified and publicly-available collection of medical records. …

📊 15 results
📏 Metrics: Micro-F1, Macro-F1, Micro-AUC, Macro-AUC, Precision@5, Precision@8, Precision@15, mAP

MIMIC-IV ICD-10

MIMIC-IV ICD-10 contains 122,279 discharge summaries—free-text medical documents—annotated with ICD-10 diagnosis and procedure codes. It contains data for patients admitted …

📊 6 results
📏 Metrics: Precision@8, F1 Macro, F1 Micro, Precision@15, R-Prec, mAP, Exact Match Ratio, AUC Macro, AUC Micro

MIMIC-IV ICD-9

MIMIC-IV ICD-9 contains 209,326 discharge summaries—free-text medical documents—annotated with ICD-9 diagnosis and procedure codes. It contains data for patients admitted …

📊 6 results
📏 Metrics: AUC Macro, AUC Micro, Exact Match Ratio, F1 Macro, F1 Micro, Precision@15, Precision@8, R-Prec, mAP

MIMIC-IV-ICD-10-full

The MIMIC-IV-ICD10-full dataset, including all occurring labels.

📊 5 results
📏 Metrics: Macro-AUC, Micro-AUC, Macro-F1, Micro-F1, Precision@8

MIMIC-IV-ICD10-top50

The MIMIC-IV-ICD10 dataset, featuring the top 50 most frequently occurring labels.

📊 5 results
📏 Metrics: F1 (micro), F1 (macro), AUC (Micro), AUC (Macro), Precision@5

MIMIC-IV-ICD9-full

The MIMIC-IV-ICD9 dataset, including all occurring labels.

📊 5 results
📏 Metrics: Macro AUC, Micro AUC, F1 Macro, F1 Micro, Precision@8

MIMIC-IV-ICD9-top50

The MIMIC-IV-ICD9 dataset, featuring the top 50 most frequently occurring labels.

📊 5 results
📏 Metrics: AUC Macro, AUC Micro, F1 Macro, F1 Micro, Precision@5

Medical Diagnosis

BreastDICOM4

Several datasets are fostering innovation in higher-level functions for everyone, everywhere. By providing this repository, we hope to encourage the …

📊 1 results
📏 Metrics: Average Precision, Average Recall

Clinical Admission Notes from MIMIC-III

This dataset is created from MIMIC-III (Medical Information Mart for Intensive Care III) and contains simulated patient admission notes. The …

📊 2 results
📏 Metrics: AUROC

Medical Genetics

BIG-bench

The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to probe large language models and extrapolate their …

📊 1 results
📏 Metrics: Accuracy

Medical Image Classification

COVIDGR

Under a close collaboration with an expert radiologist team of the Hospital Universitario San Cecilio, the COVIDGR-1.0 dataset of patients' …

📊 1 results
📏 Metrics: Accuracy

CheXphoto

CheXphoto is a competition for x-ray interpretation based on a new dataset of naturally and synthetically perturbed chest x-rays hosted …

📊 1 results
📏 Metrics: Mean AUC

IDRiD

The Indian Diabetic Retinopathy Image Dataset (IDRiD) consists of typical diabetic retinopathy lesions and normal retinal structures annotated at a …

📊 1 results
📏 Metrics: Accuracy, Accuracy (%)

ISIC 2020 Challenge Dataset

The dataset contains 33,126 dermoscopic training images of unique benign and malignant skin lesions from over 2,000 patients. Each image …

📊 1 results
📏 Metrics: AUC

ImageNet

The ImageNet dataset contains 14,197,122 annotated images according to the WordNet hierarchy. Since 2010 the dataset has been used in the …

📊 2 results
📏 Metrics: GFLOPs, Top 1 Accuracy

NCT-CRC-HE-100K

The NCT-CRC-HE-100K dataset is a set of 100,000 non-overlapping image patches extracted from 86 H&E stained human cancer tissue slides …

📊 7 results
📏 Metrics: Accuracy (%), F1-Score, Precision, Specificity

Medical Image Generation

ACDC

The goal of the Automated Cardiac Diagnosis Challenge (ACDC) is to: - compare the performance of automatic methods on …

📊 3 results
📏 Metrics: FID

Chest X-Ray Images (Pneumonia)

The normal chest X-ray (left panel) depicts clear lungs without any areas of abnormal opacification in the image. Bacterial pneumonia …

📊 1 results
📏 Metrics: Frechet Inception Distance

ChestX-ray14

ChestX-ray14 is a medical imaging dataset which comprises 112,120 frontal-view X-ray images of 30,805 (collected from the year of 1992 …

📊 1 results
📏 Metrics: FID

Medical Image Registration

IXI

IXI Dataset is a collection of 600 MR brain images from normal, healthy subjects. The MR image acquisition protocol for …

📊 7 results
📏 Metrics: DSC

OASIS

A dataset for single-image 3D in the wild consisting of annotations of detailed 3D geometry for 140,000 images. Source: [OASIS: …

📊 8 results
📏 Metrics: DSC, val dsc

SR-Reg

SR-Reg is a brain MR-CT registration dataset derived from SynthRAD 2023 (https://synthrad2023.grand-challenge.org/). This dataset contains 180 subjects' preprocessed images, and …

📊 1 results
📏 Metrics: Dice (Average)

Medical Image Segmentation

2018 Data Science Bowl

This dataset contains a large number of segmented nuclei images. The images were acquired under a variety of conditions and …

📊 10 results
📏 Metrics: Dice, mIoU, Recall, Precision, AHD95, ASD

ACDC

The goal of the Automated Cardiac Diagnosis Challenge (ACDC) is to: - compare the performance of automatic methods on …

📊 6 results
📏 Metrics: Dice Score

AMOS

Despite the considerable progress in automatic abdominal multi-organ segmentation from CT/MRI scans in recent years, a comprehensive evaluation of the …

📊 1 results
📏 Metrics: Average Dice

BKAI-IGH NeoPolyp-Small

This dataset contains 1200 images (1000 WLI images and 200 FICE images) with fine-grained segmentation annotations. The training set consists …

📊 9 results
📏 Metrics: Average Dice, mIoU, Average Dice (5-folds), MAE (5-folds), mIoU (5-folds)

Brain US

This brain anatomy segmentation dataset has 1300 2D US scans for training and 329 for testing. A total of 1629 …

📊 3 results
📏 Metrics: F1, IoU

CHASE_DB1

CHASE_DB1 is a dataset for retinal vessel segmentation which contains 28 color retina images with the size of 999×960 pixels …

📊 3 results
📏 Metrics: DSC

CVC-ClinicDB

CVC-ClinicDB is an open-access dataset of 612 images with a resolution of 384×288 from 31 colonoscopy sequences. It is used for …

📊 40 results
📏 Metrics: mean Dice, Average MAE, S-Measure, mIoU, max E-Measure, F-measure

Cell

The CELL benchmark is made up of fluorescence microscopy images of cells. Source: Multi-Domain Adversarial Learning. Image source: https://arxiv.org/pdf/1903.09239v1.pdf

📊 1 results
📏 Metrics: IoU

DRIVE

The Digital Retinal Images for Vessel Extraction (DRIVE) dataset is a dataset for retinal vessel segmentation. It consists of a …

📊 4 results
📏 Metrics: mIoU, F1 score, Recall, Specificity, Precision

Electron Microscopy Dataset

The dataset available for download on this webpage represents a 5x5x5µm section taken from the CA1 hippocampus region of the …

📊 1 results
📏 Metrics: AHD95, ASD, Dice, IoU

Endotect Polyp Segmentation Challenge Dataset

A challenge that consists of three tasks, each targeting a different requirement for in-clinic use. The first task involves classifying …

📊 2 results
📏 Metrics: DSC, mIoU, FPS

Extended Task10_Colon Medical Decathlon

A dataset of abdominal CT studies in NIfTI format from the open-source medical data repository Medical Decathlon was utilized. To …

📊 1 results
📏 Metrics: Average Dice

GlaS

The dataset used in this challenge consists of 165 images derived from 16 H&E stained histological sections of stage T3 …

📊 9 results
📏 Metrics: F1, IoU, Dice

Kvasir-Instrument

Consists of annotated frames containing GI procedure tools such as snares, balloons and biopsy forceps, etc. Besides the images, …

📊 2 results
📏 Metrics: DSC, Dice Score, Intersection over Union

Kvasir-SEG

Kvasir-SEG is an open-access dataset of gastrointestinal polyp images and corresponding segmentation masks, manually annotated by a medical doctor and …

📊 51 results
📏 Metrics: mean Dice, Average MAE, S-Measure, max E-Measure, mIoU, FPS, F-measure, Precision, Recall

KvasirCapsule-SEG

A video capsule endoscopy dataset for polyp segmentation. The dataset can be downloaded from: https://www.kaggle.com/debeshjha1/kvasircapsuleseg https://www.dropbox.com/home/KvasirCapsule-SEG …

📊 2 results
📏 Metrics: DSC, mIoU

MICCAI 2015 Head and Neck Challenge

This database is provided and maintained by Dr. Gregory C Sharp (Harvard Medical School – MGH, Boston) and his group. …

📊 1 results
📏 Metrics: Dice

MICCAI 2015 Multi-Atlas Abdomen Labeling Challenge

Under Institutional Review Board (IRB) supervision, 50 abdomen CT scans were randomly selected from a combination of an ongoing …

📊 6 results
📏 Metrics: Avg DSC, Avg HD

Medical Segmentation Decathlon

The Medical Segmentation Decathlon is a collection of medical image segmentation datasets. It contains a total of 2,633 three-dimensional images …

📊 5 results
📏 Metrics: Dice (Average), NSD

Medico automatic polyp segmentation challenge (dataset)

The “Medico automatic polyp segmentation challenge” aims to develop computer-aided diagnosis systems for automatic polyp segmentation to detect all types …

📊 2 results
📏 Metrics: DSC, mIoU, Recall, Precision, FPS

MoNuSAC

Different types of cells play a vital role in the initiation, development, invasion, metastasis and therapeutic response of tumors of …

📊 1 results
📏 Metrics: Dice, IoU

MoNuSeg

The dataset for this challenge was obtained by carefully annotating tissue images of several patients with tumors of different organs …

📊 14 results
📏 Metrics: F1, IoU, AHD95, ASD, mIoU

MosMedData

MosMedData contains anonymised human lung computed tomography (CT) scans with COVID-19 related findings, as well as without such findings. A …

📊 1 results
📏 Metrics: Average Dice

RITE

The RITE (Retinal Images vessel Tree Extraction) is a database that enables comparative studies on segmentation or classification of arteries …

📊 3 results
📏 Metrics: Dice, Jaccard Index

ROBUST-MIS

The ROBUST-MIS dataset was made available to support the Robust Medical Instrument Segmentation (ROBUST-MIS) Challenge 2019, part of the Endoscopic …

📊 3 results
📏 Metrics: DSC, mIoU, FPS

TNBC

Involves a large number of annotated cells, including normal epithelial and myoepithelial breast cells (localized in ducts and lobules), …

📊 1 results
📏 Metrics: AHD95, Dice, IoU

Medical Procedure

Clinical Admission Notes from MIMIC-III

This dataset is created from MIMIC-III (Medical Information Mart for Intensive Care III) and contains simulated patient admission notes. The …

📊 3 results
📏 Metrics: AUROC

Medical Relation Extraction

CMeIE

Chinese Medical Information Extraction, a dataset that was also released in CHIP2020, is used for the CMeIE task. The task is …

📊 1 results
📏 Metrics: Micro F1

Medical Report Generation

HistGen WSI-Report Dataset

This dataset is composed of 7,753 pairs of whole slide images and their corresponding diagnostic reports, extracted from the TCGA …

📊 1 results
📏 Metrics: BLEU-4

IU X-Ray

IU X-ray (Demner-Fushman et al., 2016) is a set of chest X-ray images paired with their corresponding diagnostic reports. The …

📊 1 results
📏 Metrics: BLEU-4, BLEU-1, BLEU-2, BLEU-3, CIDEr, METEOR, ROUGE

MIMIC-CXR

MIMIC-CXR from Massachusetts Institute of Technology presents 371,920 chest X-rays associated with 227,943 imaging studies from 65,079 patients. The studies …

📊 2 results
📏 Metrics: BLEU-1, BLEU-2, BLEU-3, BLEU-4, CIDEr, Example-F1-14, Example-Precision-14, Example-Recall-14, METEOR, Micro-F1-5, Micro-Precision-5, Micro-Recall-5, ROUGE-L, F1 RadGraph

Molecular Property Prediction

MUV

The Maximum Unbiased Validation (MUV) dataset is a benchmark dataset selected from PubChem BioAssay. It was created by applying a …

📊 2 results
📏 Metrics: ROC-AUC

MoleculeNet

MoleculeNet is a large scale benchmark for molecular machine learning. MoleculeNet curates multiple public datasets, establishes metrics for evaluation, and …

📊 5 results
📏 Metrics: AUC

PCBA

PCBA dataset 11 is a collection of high-quality dose-response data, formulated as a multitask learning benchmark from 128 high-throughput screening …

📊 1 results
📏 Metrics: ROC-AUC

QM7

QM7 dataset is a subset of the GDB-13 database. GDB-13 contains nearly 1 billion stable and synthetically accessible organic molecules. …

📊 7 results
📏 Metrics: MAE

QM8

QM8 dataset is a collection of molecular data used for studying quantum mechanical calculations of electronic spectra and excited state …

📊 7 results
📏 Metrics: MAE

QM9

QM9 provides quantum chemical properties (at DFT level) for a relevant, consistent, and comprehensive chemical space of small organic molecules. …

📊 7 results
📏 Metrics: MAE

SIDER

SIDER contains information on marketed medicines and their recorded adverse drug reactions. The information is extracted from public documents and …

📊 16 results
📏 Metrics: ROC-AUC

Tox21

The Tox21 data set comprises 12,060 training samples and 647 test samples that represent chemical compounds. There are 801 "dense …

📊 17 results
📏 Metrics: ROC-AUC

clintox

The ClinTox dataset compares drugs approved by the FDA and drugs that have failed clinical trials for toxicity reasons. The …

📊 18 results
📏 Metrics: ROC-AUC, Molecules (M)

Molecule Captioning

ChEBI-20

The dataset contains 33,010 molecule-description pairs split into 80%/10%/10% train/val/test splits. The goal of the task is to retrieve the relevant …

📊 28 results
📏 Metrics: BLEU-2, BLEU-4, METEOR, ROUGE-1, ROUGE-2, ROUGE-L, Text2Mol

L+M-24

Language-molecule models have emerged as an exciting direction for molecular discovery and understanding. However, training these models is challenging due …

📊 3 results
📏 Metrics: BLEU-2, BLEU-4, ROUGE-1, ROUGE-2, ROUGE-L, METEOR

Mortality Prediction

Clinical Admission Notes from MIMIC-III

This dataset is created from MIMIC-III (Medical Information Mart for Intensive Care III) and contains simulated patient admission notes. The …

📊 3 results
📏 Metrics: AUROC

MIMIC-III

The Medical Information Mart for Intensive Care III (MIMIC-III) dataset is a large, de-identified and publicly available collection of medical records. …

📊 13 results
📏 Metrics: F1 score, Precision, Recall, Accuracy

Multi-Label Classification

CheXpert

The CheXpert dataset contains 224,316 chest radiographs of 65,240 patients with both frontal and lateral views available. The task is …

📊 11 results
📏 Metrics: AVERAGE AUC ON 14 LABEL, NUM RADS BELOW CURVE

ChestX-ray14

ChestX-ray14 is a medical imaging dataset which comprises 112,120 frontal-view X-ray images of 30,805 (collected from the year of 1992 …

📊 4 results
📏 Metrics: Average AUC on 14 label, Macro F1

MIMIC-CXR

MIMIC-CXR from Massachusetts Institute of Technology presents 371,920 chest X-rays associated with 227,943 imaging studies from 65,079 patients. The studies …

📊 1 results
📏 Metrics: Average AUC on 14 label

MLRSNet

MLRSNet is a multi-label high spatial resolution remote sensing dataset for semantic scene understanding. It provides different perspectives of …

📊 2 results
📏 Metrics: F1-score

MRNet

The MRNet dataset consists of 1,370 knee MRI exams performed at Stanford University Medical Center. The dataset contains 1,104 (80.6%) …

📊 1 results
📏 Metrics: Average AUC, AUC on Abnormality (ABN), AUC on ACL Tear (ACL), AUC on Meniscus Tear (MEN), Average Accuracy, Accuracy on Abnormality (ABN), Accuracy on ACL Tear (ACL), Accuracy on Meniscus Tear (MEN)

NUS-WIDE

The NUS-WIDE dataset contains 269,648 images with a total of 5,018 tags collected from Flickr. These images are manually annotated …

📊 9 results
📏 Metrics: MAP

OpenImages-v6

OpenImages V6 is a large-scale dataset consisting of 9 million training images, 41,620 validation samples, and 125,456 test samples. …

📊 4 results
📏 Metrics: mAP

PASCAL VOC 2007

PASCAL VOC 2007 is a dataset for image recognition. The twenty object classes that have been selected are: Person: person …

📊 16 results
📏 Metrics: mAP

Multi-Label Classification Of Biomedical Texts

MIMIC-III

The Medical Information Mart for Intensive Care III (MIMIC-III) dataset is a large, de-identified and publicly available collection of medical records. …

📊 1 results
📏 Metrics: Micro F1

Multi-tissue Nucleus Segmentation

CoNSeP

The colorectal nuclear segmentation and phenotypes (CoNSeP) dataset consists of 41 H&E stained image tiles, each of size 1,000×1,000 pixels …

📊 2 results
📏 Metrics: Dice, Jaccard Index, PQ

Kumar

The Kumar dataset contains 30 1,000×1,000 image tiles from seven organs (6 breast, 6 liver, 6 kidney, 6 prostate, 2 …

📊 18 results
📏 Metrics: Dice, Hausdorff Distance (mm), Jaccard Index, PQ

Noise Estimation

SIDD

SIDD is an image denoising dataset containing 30,000 noisy images from 10 scenes under different lighting conditions using five representative …

📊 5 results
📏 Metrics: PSNR Gap, Average KL Divergence

Optic Cup Segmentation

REFUGE Challenge

REFUGE Challenge provides a data set of 1200 fundus images with ground truth segmentations and clinical glaucoma labels, currently the …

📊 1 results
📏 Metrics: Dice

Optic Disc Detection

IDRiD

The Indian Diabetic Retinopathy Image Dataset (IDRiD) consists of typical diabetic retinopathy lesions and normal retinal structures annotated at a …

📊 1 results
📏 Metrics: Euclidean Distance (ED)

Participant Intervention Comparison Outcome Extraction

EBM-NLP

EBM-NLP annotates PICO (Participants, Interventions, Comparisons and Outcomes) spans in clinical trial abstracts. The corresponding PICO Extraction task aims to …

📊 5 results
📏 Metrics: F1

Pneumonia Detection

Chest X-ray images

Chest X-ray images for pneumonia detection.

📊 3 results
📏 Metrics: Accuracy

ChestX-ray14

ChestX-ray14 is a medical imaging dataset which comprises 112,120 frontal-view X-ray images of 30,805 (collected from the year of 1992 …

📊 4 results
📏 Metrics: AUROC, Params, FLOPS

Promoter Detection

GUE

A collection of 28 datasets across 7 tasks constructed for genome language model evaluation. Contains seven tasks: promoter prediction, core …

📊 1 results
📏 Metrics: MCC

Protein Design

CATH 4.2

The CATH (Class, Architecture, Topology, Homology) [65] database is a comprehensive resource for protein structure classification that hierarchically groups proteins …

📊 8 results
📏 Metrics: Perplexity, Sequence Recovery %(All)

CATH 4.3

The CATH (Class, Architecture, Topology, Homology) [65] database is a comprehensive resource for protein structure classification that hierarchically groups proteins …

📊 2 results
📏 Metrics: Perplexity, Sequence Recovery %(All)

Quantum Machine Learning

iris

The Iris flower data set or Fisher's Iris data set is a multivariate data set introduced by the British statistician, …

📊 1 results
📏 Metrics: Average F1

Respiratory Failure

HiRID

HiRID is a freely accessible critical care dataset containing data relating to almost 34 thousand patient admissions to the Department …

📊 8 results
📏 Metrics: AUPRC, Recall@50

Retinal Vessel Segmentation

CHASE_DB1

CHASE_DB1 is a dataset for retinal vessel segmentation which contains 28 color retina images with the size of 999×960 pixels …

📊 15 results
📏 Metrics: AUC, F1 score, mIOU, Sensitivity, MCC, 1:1 Accuracy, Acc, Average IOU, DSC

DRIVE

The Digital Retinal Images for Vessel Extraction (DRIVE) dataset is a dataset for retinal vessel segmentation. It consists of a …

📊 19 results
📏 Metrics: AUC, F1 score, Accuracy, mIoU, sensitivity, Specificity, MCC, 1:1 Accuracy, Average IOU, DSC

HRF

The HRF dataset is a dataset for retinal vessel segmentation which comprises 45 images and is organized as 15 subsets. …

📊 4 results
📏 Metrics: AUC, F1 score, MCC, mIoU, 1:1 Accuracy, Acc, Average IOU, DSC, Sensitivity

INSPIRE-AVR (LUNet subset)

This dataset contains 65 DFIs acquired from patients with POAG at the University of Iowa Hospitals and Clinics. DFIs were …

📊 1 results
📏 Metrics: Average Dice

STARE

The STARE (Structured Analysis of the Retina) dataset is a dataset for retinal vessel segmentation. It contains 20 equal-sized (700×605) …

📊 9 results
📏 Metrics: AUC, F1 score, mIOU, Sensitivity, Acc, MCC, 1:1 Accuracy, Average IOU, DSC

UZLF

The Leuven-Haifa dataset contains 240 disc-centered fundus images of 224 unique patients (75 patients with normal tension glaucoma, 63 patients …

📊 5 results
📏 Metrics: Average Dice (0.5*Dice_a + 0.5*Dice_v)

Seizure Detection

CHB-MIT

The CHB-MIT dataset is a dataset of EEG recordings from pediatric subjects with intractable seizures. Subjects were monitored for up …

📊 1 results
📏 Metrics: Accuracy

TUH EEG Seizure Corpus

Our goal is to enable deep learning research in neuroscience by releasing the largest publicly available unencumbered database of EEG …

📊 2 results
📏 Metrics: AUROC

Semantic Segmentation

ACDC Scribbles

We release expert-made scribble annotations for the medical ACDC dataset [1]. The released data must be considered as extending the …

📊 6 results
📏 Metrics: Dice (Average)

ADE20K

The ADE20K semantic segmentation dataset contains more than 20K scene-centric images exhaustively annotated with pixel-level objects and object parts labels. …

📊 229 results
📏 Metrics: Validation mIoU, Test Score, Params (M), GFLOPs (512 x 512), GFLOPs, Mean IoU (class)

AI-TOD

AI-TOD comes with 700,621 object instances for eight categories across 28,036 aerial images. Compared to existing object detection datasets in …

📊 2 results
📏 Metrics: Dice

AIRS

The AIRS (Aerial Imagery for Roof Segmentation) dataset provides a wide coverage of aerial imagery with 7.5 cm resolution and …

📊 1 results
📏 Metrics: IoU

ATLANTIS

ATLANTIS is a benchmark for semantic segmentation of waterbody images. This dataset covers a wide range of natural waterbodies such …

📊 1 results
📏 Metrics: A-acc, A-mIoU, Accuracy, mIoU

ApolloScape

ApolloScape is a large dataset consisting of over 140,000 video frames (73 street scene videos) from various locations in China …

📊 2 results
📏 Metrics: mIoU

BIG

A high-resolution semantic segmentation dataset with 50 validation and 100 test objects. Image resolution in BIG ranges from 2048×1600 to …

📊 4 results
📏 Metrics: mBA, IoU

CC3M-TagMask

The dataset offers tag and mask annotations for image-text pairs from the CC3M validation set. Tag annotations denote words that …

📊 4 results
📏 Metrics: mIoU

CEMS-W

The dataset includes annotations for burned area delineation and land cover segmentation, with a focus on European soil. The dataset …

📊 3 results
📏 Metrics: mIoU

COCO (Common Objects in Context)

The COCO (Common Objects in Context) dataset is a large-scale object detection, segmentation, and captioning dataset. It is designed to …

📊 9 results
📏 Metrics: mIoU

COCO-Stuff

The Common Objects in COntext-stuff (COCO-stuff) dataset is a dataset for scene understanding tasks like semantic segmentation, object detection and …

📊 1 results
📏 Metrics: F.W. IU, Per-Class Accuracy, Pixel Accuracy, mIoU

Cam2BEV

The dataset contains two subsets of synthetic, semantically segmented road-scene images, which have been created for developing and applying the …

📊 1 results
📏 Metrics: Mean IoU

CamVid

CamVid (Cambridge-driving Labeled Video Database) is a road/driving scene understanding database which was originally captured as five video sequences with …

📊 20 results
📏 Metrics: Mean IoU, Global Accuracy

Cityscapes

Cityscapes is a large-scale database which focuses on semantic understanding of urban street scenes. It provides semantic, instance-wise, and dense …

📊 2 results
📏 Metrics: mIoU, Pixel Accuracy

Cityscapes 3D

Detecting vehicles and representing their position and orientation in the three dimensional space is a key technology for autonomous driving. …

📊 1 results
📏 Metrics: mIoU

Cityscapes VIPriors subset

The training and validation data are subsets of the training split of the Cityscapes dataset. The test set is taken …

📊 1 results
📏 Metrics: Accuracy, mIoU

DADA-seg

DADA-seg is a pixel-wise annotated accident dataset, which contains a variety of critical scenarios from traffic accidents. It is used …

📊 27 results
📏 Metrics: mIoU

DDD17

DDD17 has over 12 h of a 346x260 pixel DAVIS sensor recording highway and city driving in daytime, evening, night, …

📊 9 results
📏 Metrics: mIoU

DELIVER

DELIVER is an arbitrary-modal segmentation benchmark, covering Depth, LiDAR, multiple Views, Events, and RGB. Aside from this, the dataset is …

📊 9 results
📏 Metrics: mIoU, test mIoU

DIVA-HisDB

The database consists of 150 annotated pages of three different medieval manuscripts with challenging layouts. Furthermore, we provide a layout …

📊 2 results
📏 Metrics: Mean IoU (class)

DSEC

DSEC is a stereo camera dataset in driving scenarios that contains data from two monochrome event cameras and two global …

📊 9 results
📏 Metrics: mIoU

Dark Zurich

Dark Zurich is an image dataset containing a total of 8779 images captured at nighttime, twilight, and daytime, along with …

📊 14 results
📏 Metrics: mIoU

DensePASS

DensePASS - a novel densely annotated dataset for panoramic segmentation under cross-domain conditions, specifically built to study the Pinhole-to-Panoramic transfer …

📊 35 results
📏 Metrics: mIoU

DroneDeploy

From DroneDeploy: We’ve collected a dataset of aerial orthomosaics and elevation images. These have been annotated into 6 different classes: …

📊 1 results
📏 Metrics: Mean IoU (test), Mean IoU (val)

Endoscapes

Cholecystectomy is a very common abdominal surgical procedure almost ubiquitously performed with a laparoscopic approach, hence guided by an endoscopic …

📊 2 results
📏 Metrics: Mean F1

FLAIR (French Land cover from Aerospace ImageRy)

The French National Institute of Geographical and Forest Information (IGN) has the mission to document and measure land-cover on French …

📊 4 results
📏 Metrics: mIoU

FMB Dataset

FMB contains 1500 well-registered infrared and visible image pairs with 14 annotated pixel-level categories. Also, it covers a wide range …

📊 13 results
📏 Metrics: mIoU

Fine-Grained Cloud Segmentation Dataset

The dataset consists of 96 terrain-corrected (Level-1T) scenes from Landsat 8 OLI and TIRS, covering diverse biomes. This variety supports …

📊 3 results
📏 Metrics: mIoU

Fine-Grained Grass Segmentation Dataset

The dataset was created using high-resolution (8 m) satellite imagery from the Gaofen series (Gaofen-2 and Gaofen-6), captured in 2019 …

📊 9 results
📏 Metrics: mIoU

FoodSeg103

FoodSeg103 is a new food image dataset containing 7,118 images. Images are annotated with 104 ingredient classes and each image …

📊 7 results
📏 Metrics: mIoU

Forward-Looking Sonar Marine Debris Datasets

This dataset is made up of forward-looking sonar images containing ten classes of underwater debris. The dataset can be used …

📊 1 results
📏 Metrics: mIOU

Freiburg Forest

The Freiburg Forest dataset was collected using a Viona autonomous mobile robot platform equipped with cameras for capturing multi-spectral and …

📊 2 results
📏 Metrics: Mean IoU

HAM10000

HAM10000 is a dataset of 10000 training images for detecting pigmented skin lesions. The authors collected dermatoscopic images from different …

📊 1 results
📏 Metrics: Average Dice, Average IOU

HERA RFI Detection

This dataset contains simulated and expert-labelled spectrograms from two radio telescopes: the Hydrogen Epoch of Reionization Array (HERA) in South …

📊 2 results
📏 Metrics: AUPRC, AUROC, F1

Hypersim

For many fundamental scene understanding tasks, it is difficult or impossible to obtain per-pixel ground truth labels from real images. …

📊 5 results
📏 Metrics: mIoU, mIoU (test)

INRIA Aerial Image Labeling

The INRIA Aerial Image Labeling dataset is comprised of 360 RGB tiles of 5000×5000px with a spatial resolution of 30cm/px …

📊 6 results
📏 Metrics: IoU, mIOU

ISPRS Potsdam

The data set contains 38 patches (of the same size), each consisting of a true orthophoto (TOP) extracted from a …

📊 17 results
📏 Metrics: Overall Accuracy, Mean F1, Mean IoU

ISPRS Vaihingen

The data set contains 33 patches (of different sizes), each consisting of a true orthophoto (TOP) extracted from a larger …

📊 10 results
📏 Metrics: Overall Accuracy, Average F1, Category mIoU

ImageNet-S

Powered by the ImageNet dataset, unsupervised learning on large-scale data has made significant advances for classification tasks. There are two …

📊 20 results
📏 Metrics: mIoU (val), mIoU (test)

KITTI-360

KITTI-360 is a large-scale dataset that contains rich sensory information and full annotations. It is the successor of the popular …

📊 14 results
📏 Metrics: mIoU

Kvasir-Instrument

Consists of annotated frames containing GI procedure tools such as snares, balloons and biopsy forceps, etc. Besides the images, …

📊 2 results
📏 Metrics: DSC, mIoU

LOFAR RFI Detection

This dataset contains simulated and expert-labelled spectrograms from two radio telescopes: the Hydrogen Epoch of Reionization Array (HERA) in South …

📊 2 results
📏 Metrics: AUPRC, AUROC, F1

LaRS

LaRS is the largest and most diverse panoptic maritime obstacle detection dataset. Highlights: * Diverse scenes from manual capture, public …

📊 20 results
📏 Metrics: Q, F1, μ, mIoU

LoveDA

1. 5987 high spatial resolution (0.3 m) remote sensing images from Nanjing, Changzhou, and Wuhan 2. Focus on different geographical …

📊 16 results
📏 Metrics: Category mIoU

MCubeS

Multimodal material segmentation (MCubeS) dataset contains 500 sets of images from 42 street scenes. Each scene has images for four …

📊 21 results
📏 Metrics: mIoU

MCubeS (P)

Multimodal material segmentation (MCubeS) dataset contains 500 sets of images from 42 street scenes. Each scene has images for four …

📊 8 results
📏 Metrics: mIoU

MUSES: MUlti-SEnsor Semantic perception dataset

MUSES offers 2500 multi-modal scenes, evenly distributed across various combinations of weather conditions (clear, fog, rain, and snow) and types …

📊 2 results
📏 Metrics: mIoU

Matterport3D

The Matterport3D dataset is a large RGB-D dataset for scene understanding in indoor environments. It contains 10,800 panoramic views inside …

📊 4 results
📏 Metrics: Test mIoU, Validation mIoU

Mila Simulated Floods

Mila Simulated Floods Dataset is a 1.5 square km virtual world using the Unity3D game engine including urban, suburban and …

📊 1 results
📏 Metrics: mIoU

MixedWM38

MixedWM38 Dataset (WaferMap) has more than 38000 wafer maps, including 1 normal pattern, 8 single defect patterns, and 29 mixed defect …

📊 1 results
📏 Metrics: Dice, Mean IoU

Montgomery County X-ray Set

X-ray images in this data set have been acquired from the tuberculosis control program of the Department of Health and Human …

📊 3 results
📏 Metrics: F1-score

Nighttime Driving

Nighttime Driving is a dataset of road scenes consisting of 35,000 images ranging from daytime to twilight time and to …

📊 12 results
📏 Metrics: mIoU

OpenEDS

OpenEDS (Open Eye Dataset) is a large scale data set of eye-images captured using a virtual-reality (VR) head mounted display …

📊 1 results
📏 Metrics: mIOU

PASCAL Context

The PASCAL Context dataset is an extension of the PASCAL VOC 2010 detection challenge, and it contains pixel-wise labels for …

📊 62 results
📏 Metrics: mIoU, Mean Accuracy, Pixel Accuracy

PASCAL VOC

The PASCAL Visual Object Classes (VOC) 2012 dataset contains 20 object categories including vehicles, household, animals, and other: aeroplane, bicycle, …

📊 1 results
📏 Metrics: mIoU

PASCAL VOC 2007

PASCAL VOC 2007 is a dataset for image recognition. The twenty object classes that have been selected are: Person: person …

📊 2 results
📏 Metrics: Mean IoU

PASCAL VOC 2011

PASCAL VOC 2011 is an image segmentation dataset. It contains around 2,223 images for training, consisting of 5,034 objects. Testing …

📊 1 results
📏 Metrics: Mean IoU

PASCAL VOC 2012 test

SCC Data Set

📊 51 results
📏 Metrics: Mean IoU, FLOPS, Params

PASTIS

PASTIS is a benchmark dataset for panoptic and semantic segmentation of agricultural parcels from satellite image time series. It is …

📊 3 results
📏 Metrics: Mean IoU (test), Number of Params, Overall Accuracy

PASTIS-R

Extension of the PASTIS benchmark with radar and optical image time series.

📊 1 results
📏 Metrics: IoU

PETRAW

The PETRAW data set is composed of 150 sequences of peg transfer training sessions. The objective of the peg transfer session …

📊 4 results
📏 Metrics: Mean IoU (class)

PH2

The increasing incidence of melanoma has recently promoted the development of computer-aided diagnosis systems for the classification of dermoscopic images. …

📊 2 results
📏 Metrics: Average Dice, Average IOU

Pothole Mix

This dataset for the semantic segmentation of potholes and cracks on the road surface was assembled from 5 other datasets …

📊 7 results
📏 Metrics: Test Dice Multiclass, Test mIoU, Validation Dice Multiclass, Validation mIoU

Potsdam

https://paperswithcode.com/sota/semantic-segmentation-on-isprs-potsdam

📊 3 results
📏 Metrics: mIoU

RUGD

A Video Dataset for Visual Perception and Autonomous Navigation in Unstructured Environments. Website: http://rugd.vision/ The RUGD dataset focuses on semantic …

📊 1 results
📏 Metrics: AIOU, mIoU

Replica

The Replica Dataset is a dataset of high quality reconstructions of a variety of indoor spaces. Each reconstruction has clean …

📊 5 results
📏 Metrics: mIoU

S3DIS

The Stanford 3D Indoor Scene Dataset (S3DIS) contains 6 large-scale indoor areas with 271 rooms. Each point in the …

📊 50 results
📏 Metrics: Mean IoU, mAcc, oAcc, FLOPs, Number of params, mIoU, Params (M)

SBCoseg

The SBCoseg dataset includes 889 groups of images and each group consists of 18 images with a common object, leading …

📊 1 results
📏 Metrics: Jaccard

STARE

The STARE (Structured Analysis of the Retina) dataset is a dataset for retinal vessel segmentation. It contains 20 equal-sized (700×605) …

📊 1 results
📏 Metrics: AUC

SWIMSEG

The SWIMSEG dataset contains 1013 images of sky/cloud patches, along with their corresponding binary segmentation maps. The ground truth annotation …

📊 1 results
📏 Metrics: Average Precision, Average Recall, F1-Score, MCC, Mean IoU

SWINSEG

The SWINSEG dataset contains 115 nighttime images of sky/cloud patches along with their corresponding binary ground truth maps. The ground …

📊 1 results
📏 Metrics: Average Precision, Average Recall, F1-Score, MCC, Mean IoU

SWINySEG

The SWINySEG dataset contains 6768 daytime and nighttime images of sky/cloud patches along with their corresponding binary ground truth maps. The …

📊 1 results
📏 Metrics: Average Precision, Average Recall, F1-Score, MCC, Mean IoU

SYNTHIA

The SYNTHIA dataset is a synthetic dataset that consists of 9400 multi-viewpoint photo-realistic frames rendered from a virtual city and …

📊 2 results
📏 Metrics: mIoU

ScanNet

ScanNet is an instance-level indoor RGB-D dataset that includes both 2D and 3D data. It is a collection of labeled …

📊 44 results
📏 Metrics: val mIoU, test mIoU

Semantic3D

Semantic3D is a point cloud dataset of scanned outdoor scenes with over 3 billion points. It contains 15 training and …

📊 13 results
📏 Metrics: mIoU, oAcc

SemanticPOSS

The SemanticPOSS dataset for 3D semantic segmentation contains 2988 varied and complex LiDAR scans with a large quantity of dynamic instances. …

📊 1 results
📏 Metrics: Mean IoU

ShapeNet

ShapeNet is a large-scale repository for 3D CAD models developed by researchers from Stanford University, Princeton University and the …

📊 4 results
📏 Metrics: Mean IoU

SpaceNet 1

SpaceNet 1: Building Detection v1 is a dataset for building footprint detection. The data is comprised of 382,534 building footprints, …

📊 10 results
📏 Metrics: Mean IoU

Structured3D

Structured3D is a large-scale photo-realistic dataset containing 3.5K house designs created by professional designers with a variety of ground …

📊 4 results
📏 Metrics: Test mIoU, Validation mIoU

Trans10K

A large-scale dataset for transparent object segmentation, named Trans10K, consisting of 10,428 images of real scenarios with careful manual annotations, …

📊 14 results
📏 Metrics: mIoU, GFLOPs

UAVid

UAVid is a high-resolution UAV semantic segmentation dataset that brings new challenges, including large scale variation, moving …

📊 6 results
📏 Metrics: Mean IoU

UPLight

UPLight is an underwater RGB-Polarization multimodal semantic segmentation dataset with 12 typical underwater semantic classes.

📊 6 results
📏 Metrics: mIoU

VDD

Semantic segmentation of drone images is critical for various aerial vision tasks as it provides essential semantic details to …

📊 7 results
📏 Metrics: mIoU

WildDash

WildDash is a benchmark evaluation method that uses meta-information to calculate the robustness of a given algorithm …

📊 1 results
📏 Metrics: Mean IoU

ZJU-RGB-P

Research on semantic segmentation of traffic scenes using color and polarization information (including training and testing sets).

📊 13 results
📏 Metrics: mIoU, Frame (fps)

iSAID

iSAID contains 655,451 object instances for 15 categories across 2,806 high-resolution images. The images of iSAID are the same as …

📊 15 results
📏 Metrics: mIoU

Single-step retrosynthesis

USPTO-50k

A subset and preprocessed version of "Chemical reactions from US patents (1976-Sep2016)" by Daniel Lowe. It includes 50K randomly selected reactions …

📊 23 results
📏 Metrics: Top-1 accuracy, Top-3 accuracy, Top-5 accuracy, Top-10 accuracy, Top-20 accuracy, Top-50 accuracy

Sleep Stage Detection

ISRUC-Sleep

ISRUC-Sleep is a polysomnographic (PSG) dataset. The data were obtained from human adults, including healthy subjects, and subjects with sleep …

📊 2 results
📏 Metrics: Accuracy, AUROC, Kappa, Macro-F1

Montreal Archive of Sleep Studies

The Montreal Archive of Sleep Studies (MASS) is an open-access and collaborative database of laboratory-based polysomnography (PSG) recordings O’Reilly, C., …

📊 2 results
📏 Metrics: Accuracy, Cohen's kappa, Macro-F1

PhysioNet Challenge 2018

Data for this challenge were contributed by the Massachusetts General Hospital’s (MGH) Computational Clinical Neurophysiology Laboratory (CCNL), and the Clinical …

📊 2 results
📏 Metrics: Accuracy, Cohen's Kappa, Macro-F1

SHHS

The Sleep Heart Health Study (SHHS) is a multi-center cohort study implemented by the National Heart Lung & Blood Institute …

📊 7 results
📏 Metrics: Accuracy, Cohen's Kappa, Macro-F1

Sleep-EDF

The sleep-edf database contains 197 whole-night PolySomnoGraphic sleep recordings, containing EEG, EOG, chin EMG, and event markers. Some records also …

📊 8 results
📏 Metrics: Accuracy, Cohen's kappa, Macro-F1

Splice Site Prediction

GUE

A collection of 28 datasets across 7 tasks constructed for genome language model evaluation. Contains seven tasks: promoter prediction, core …

📊 1 results
📏 Metrics: MCC

Surgical Skills Evaluation

JIGSAWS

The JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) is a surgical activity dataset for human motion modeling. The data …

📊 2 results
📏 Metrics: Accuracy, Edit Distance

Synthetic Data Generation

UNSW-NB15

UNSW-NB15 is a network intrusion dataset. It contains nine different attacks, including DoS, Worms, Backdoors, and Fuzzers. The dataset contains …

📊 2 results
📏 Metrics: EMD

Text-based de novo Molecule Generation

ChEBI-20

The dataset contains 33,010 molecule-description pairs split into 80%/10%/10% train/val/test splits. The goal of the task is to retrieve the relevant …

📊 19 results
📏 Metrics: BLEU, Exact Match, Frechet ChemNet Distance (FCD), Levenshtein, MACCS FTS, Morgan FTS, RDK FTS, Text2Mol, Validity, Parameter Count