GEOM-DRUGS

Dataset Information
Modalities: Graphs, 3D
Introduced: 2023
License: Unknown

Overview

GEOM-DRUGS is a dataset of 430,000 large organic molecules with up to 180 atoms each, introduced by Axelrod and Gómez-Bombarelli (Scientific Data, 2022).

Most machine learning papers use a processed version of the dataset, available from: link or link.

The original, unprocessed dataset is available at: link.

Variants: GEOM-DRUGS

Associated Benchmarks

This dataset is used in 1 benchmark: Unconditional Molecule Generation.

Recent Benchmark Submissions

Task | Model | Paper | Date
Unconditional Molecule Generation | TABASCO | TABASCO: A Fast, Simplified Model … | 2025-07-01
Unconditional Molecule Generation | ADiT | All-atom Diffusion Transformers: Unified generative … | 2025-03-05
Unconditional Molecule Generation | SemlaFlow | SemlaFlow -- Efficient 3D Molecular … | 2024-06-11
Unconditional Molecule Generation | EQGAT-diff | Navigating the Design Space of … | 2023-09-29
Unconditional Molecule Generation | MiDi | MiDi: Mixed Graph and 3D … | 2023-02-17
