QM7

Dataset Information
Introduced: 2020
License: Unknown

Overview

QM7 is a subset of the GDB-13 database, which contains nearly 1 billion stable and synthetically accessible organic molecules. The subset keeps only molecules with up to 23 atoms, of which up to 7 are heavy atoms (carbon (C), nitrogen (N), oxygen (O), and sulfur (S)) and the rest are hydrogen (H). In total, QM7 contains 7,165 molecules. Each molecule is represented by its Coulomb matrix, which encodes the nuclear charges of the atoms and the pairwise Coulomb repulsion between them.
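The overview names the Coulomb matrix without defining it. The sketch below uses the definition that is standard in the Coulomb matrix literature: diagonal entries are 0.5 * Z_i^2.4 and off-diagonal entries are Z_i * Z_j / |R_i - R_j|, where Z is the nuclear charge and R the position in atomic units. The function name `coulomb_matrix`, the zero-padding to 23 rows (chosen to match the maximum atom count stated above), and the toy H2 input are assumptions made for illustration; they are not taken from the dataset files.

```python
import math


def coulomb_matrix(charges, coords, size=23):
    """Build a zero-padded size x size Coulomb matrix as nested lists.

    charges: nuclear charges Z_i, one per atom.
    coords:  (x, y, z) positions in bohr, one per atom.
    size:    padded dimension; 23 is assumed from QM7's largest molecule.
    """
    n = len(charges)
    if n != len(coords):
        raise ValueError("charges and coords must have the same length")
    if n > size:
        raise ValueError("molecule has more atoms than the padded size")

    matrix = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: 0.5 * Z_i^2.4
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: Coulomb repulsion between nuclei i and j
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Hypothetical toy input (not a QM7 entry): H2 with nuclei 1.4 bohr apart.
h2_charges = [1, 1]
h2_coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)]
cm = coulomb_matrix(h2_charges, h2_coords)
```

Rows and columns beyond the real atoms stay at zero, so molecules of different sizes share one fixed input shape. The matrix depends on atom ordering, which is why models built on it usually sort or randomize the rows; that step is left out of this sketch.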

Variants: QM7

Associated Benchmarks

This dataset is used in 1 benchmark.

Recent Benchmark Submissions

Task                          | Model          | Paper                                                    | Date
Molecular Property Prediction | ChemRL-GEM     | ChemRL-GEM: Geometry Enhanced Molecular Representation … | 2021-06-11
Molecular Property Prediction | GROVER (large) | Self-Supervised Graph Transformer on Large-Scale …       | 2020-06-18
Molecular Property Prediction | GROVER (base)  | Self-Supervised Graph Transformer on Large-Scale …       | 2020-06-18
Molecular Property Prediction | PretrainGNN    | Strategies for Pre-training Graph Neural …               | 2019-05-29
Molecular Property Prediction | D-MPNN         | Analyzing Learned Molecular Representations for …        | 2019-04-02
Molecular Property Prediction | N-GramRF       | N-Gram Graph: Simple Unsupervised Representation …       | 2018-06-24
Molecular Property Prediction | N-GramXGB      | N-Gram Graph: Simple Unsupervised Representation …       | 2018-06-24
