GEOM-DRUGS is a dataset of 430,000 large organic molecules with up to 180 atoms each, introduced by Axelrod and Gómez-Bombarelli (Scientific Data, 2022).
Most machine learning papers use a processed version of the dataset, available from: link or link.
The original, unprocessed dataset is available at: link.
Variants: GEOM-DRUGS
This dataset is used in 1 benchmark:
| Task | Model | Paper | Date |
|---|---|---|---|
| Unconditional Molecule Generation | TABASCO | TABASCO: A Fast, Simplified Model … | 2025-07-01 |
| Unconditional Molecule Generation | ADiT | All-atom Diffusion Transformers: Unified generative … | 2025-03-05 |
| Unconditional Molecule Generation | SemlaFlow | SemlaFlow -- Efficient 3D Molecular … | 2024-06-11 |
| Unconditional Molecule Generation | EQGAT-diff | Navigating the Design Space of … | 2023-09-29 |
| Unconditional Molecule Generation | MiDi | MiDi: Mixed Graph and 3D … | 2023-02-17 |
Recent papers with results on this dataset: