Molecular Geometry Prediction using a Deep Generative Graph Neural Network
TensorFlow implementation of the models described in the paper Molecular Geometry Prediction using a Deep Generative Graph Neural Network.
We present code for training deep generative models of molecular geometry (conformations), as well as preprocessed datasets and pretrained models.
Dependencies
Python
- Python 3.6
- TensorFlow 1.2
- RDKit 2018.09.1
- tensorboardX
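Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_modules` is hypothetical, and the import names (`tensorflow`, `rdkit`, `tensorboardX`) are assumed to correspond to the packages listed. It only checks that each module can be found; it does not verify versions.

```python
import importlib.util
import sys

# Import names assumed to correspond to the packages listed above.
REQUIRED_MODULES = ("tensorflow", "rdkit", "tensorboardX")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python %d.%d" % sys.version_info[:2])
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```
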
GPU
- CUDA (we recommend the latest version; version 9.0 was used in all our experiments)
Downloading Datasets & Pre-trained Models
Note: Due to licensing issues, we cannot release the preprocessed CSD dataset. However, if you have a license to use the CSD dataset, please email us and we will send you the preprocessed dataset.
Dataset | Data | Model |
---|---|---|
QM9 | Data | Model |
COD | Data | Model |
Training the Conditional Variational Graph Autoencoder (CVGAE)
QM9
python PredX_train.py --data QM9 --mpnn-steps 3
COD
python PredX_train.py --data COD --mpnn-steps 5
Loading & Generation from Pre-trained Models
QM9
python PredX_train.py --data QM9 --loaddir qm9_model/neuralnet_model_best.ckpt --test --mpnn_steps 3
COD
python PredX_train.py --data COD --loaddir cod_model/neuralnet_model_best.ckpt --test --mpnn_steps 5
Running Force-Field Baselines (ETKDG + MMFF/UFF)
QM9/COD/CSD
python baseline.py --data QM9 --num-total-samples 100 --num-parallel-samples 10 --num-threads 10
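For orientation, the ETKDG + MMFF idea behind this baseline can be sketched with RDKit directly. This is a minimal illustration under stated assumptions, not the contents of `baseline.py`: the function name `etkdg_mmff_conformers`, the example SMILES, and the parameter values are hypothetical, and the sketch does not model the `--num-total-samples`, `--num-parallel-samples`, or `--num-threads` options.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def etkdg_mmff_conformers(smiles, num_confs=10, seed=0, max_iters=200):
    """Embed conformers with ETKDG, then relax each one with the MMFF force field.

    Returns the molecule (with hydrogens), the conformer ids, and a list of
    (not_converged, energy) tuples, one per conformer.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError("could not parse SMILES: %r" % smiles)
    mol = Chem.AddHs(mol)  # explicit hydrogens are needed for sensible 3D geometry

    params = AllChem.ETKDG()
    params.randomSeed = seed  # fixed seed so the embedding is reproducible
    conf_ids = list(AllChem.EmbedMultipleConfs(mol, num_confs, params))

    # Each entry is (not_converged, energy); not_converged == 0 means converged.
    results = AllChem.MMFFOptimizeMoleculeConfs(mol, maxIters=max_iters)
    return mol, conf_ids, results


if __name__ == "__main__":
    mol, conf_ids, results = etkdg_mmff_conformers("CCO", num_confs=5)
    for conf_id, (not_converged, energy) in zip(conf_ids, results):
        print(conf_id, not_converged, energy)
```

Swapping `MMFFOptimizeMoleculeConfs` for `UFFOptimizeMoleculeConfs` gives the UFF variant of the same sketch.
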
Running Force-Field Baselines with Initial Atom Coordinates Provided by the Neural Network (CVGAE + MMFF)
QM9/COD/CSD
python baseline_nn.py --data QM9 --nn-path /path/to/qm9_cvgae_confs
Notes:
--nn-path points to the folder containing the saved conformations generated by CVGAE. To save these conformations, add the --savepermol argument during the loading/generation stage.
Example (QM9):
python PredX_train.py --data QM9 --loaddir qm9_model/neuralnet_model_best.ckpt --savepermol --test --mpnn_steps 3
Instead of saving the CVGAE conformations and loading them separately, you can run CVGAE + MMFF in a single step by adding the --useFF argument during the loading/generation stage.
Example (QM9):
python PredX_train.py --data QM9 --loaddir qm9_model/neuralnet_model_best.ckpt --useFF --test --mpnn_steps 3
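The CVGAE + MMFF idea, starting a force-field optimization from predicted coordinates rather than from a distance-geometry embedding, can be sketched with RDKit as follows. This is a hedged illustration, not the code in `baseline_nn.py` or behind `--useFF`: the function name `refine_with_mmff` is hypothetical, the coordinates are passed as a plain list of (x, y, z) tuples in atom order, and the on-disk format of the saved CVGAE conformations is not assumed. In the usage part, an RDKit embedding stands in for the neural network output.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Geometry import Point3D


def refine_with_mmff(mol, coords, max_iters=200):
    """Attach `coords` to a copy of `mol` as its only conformer and relax it with MMFF.

    `mol` must already contain explicit hydrogens, and `coords` must hold one
    (x, y, z) tuple per atom, in atom order. Returns the refined copy and the
    MMFF status (0: converged, 1: not converged, -1: force field setup failed).
    """
    if len(coords) != mol.GetNumAtoms():
        raise ValueError("expected one coordinate triple per atom")

    conf = Chem.Conformer(mol.GetNumAtoms())
    for idx, (x, y, z) in enumerate(coords):
        conf.SetAtomPosition(idx, Point3D(float(x), float(y), float(z)))

    refined = Chem.Mol(mol)  # copy, so the input molecule is left untouched
    refined.RemoveAllConformers()
    conf_id = refined.AddConformer(conf, assignId=True)
    status = AllChem.MMFFOptimizeMolecule(refined, maxIters=max_iters, confId=conf_id)
    return refined, status


if __name__ == "__main__":
    mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))
    # Stand-in for predicted coordinates: an ETKDG embedding of the same molecule.
    AllChem.EmbedMolecule(mol, randomSeed=0)
    start = mol.GetConformer()
    coords = []
    for i in range(mol.GetNumAtoms()):
        pos = start.GetAtomPosition(i)
        coords.append((pos.x, pos.y, pos.z))
    refined, status = refine_with_mmff(mol, coords)
    print("MMFF status:", status)
```
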
Additional Scripts
QM9_featurize.py, QM9_sdf_to_p.py, COD_featurize.py, COD_sdf_to_p.py, CSD_featurize.py, and CSD_sdf_to_p.py are scripts for preprocessing the QM9, COD, and CSD datasets, respectively.
Citation
If you find the resources in this repository useful, please consider citing:
@article{Mansimov:19,
author = {Elman Mansimov and Omar Mahmood and Seokho Kang and Kyunghyun Cho},
title = {Molecular Geometry Prediction using a Deep Generative Graph Neural Network},
year = {2019},
journal = {arXiv preprint arXiv:1904.00314},
}