Molecular Geometry Prediction using a Deep Generative Graph Neural Network

TensorFlow implementation of the models described in the paper Molecular Geometry Prediction using a Deep Generative Graph Neural Network.

We present code for training deep generative models of molecular geometry (conformations), as well as preprocessed datasets and pretrained models.

Dependencies

Python

GPU

Downloading Datasets & Pre-trained Models

Note: Due to licensing issues, we can't release the preprocessed CSD dataset. However, if you have a license to use the CSD dataset, please email us and we will send you the preprocessed dataset.

Dataset | Model
QM9 Data | Model
COD Data | Model

Training Conditional Variational Graph Auto Encoder (CVGAE)

QM9

python PredX_train.py --data QM9 --mpnn-steps 3

COD

python PredX_train.py --data COD --mpnn-steps 5
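The two training commands differ only in the dataset name and the number of message-passing steps. As a minimal sketch, assuming PredX_train.py sits in the current working directory, a small wrapper can build either command from one table. The names `MPNN_STEPS` and `train_command` are hypothetical and not part of this repository.

```python
import shlex

# Message-passing steps used in the training commands above, per dataset.
MPNN_STEPS = {"QM9": 3, "COD": 5}


def train_command(dataset, python="python"):
    """Return the argv list for training CVGAE on `dataset` (hypothetical helper)."""
    if dataset not in MPNN_STEPS:
        raise ValueError(f"no training settings listed for dataset {dataset!r}")
    return [
        python, "PredX_train.py",
        "--data", dataset,
        "--mpnn-steps", str(MPNN_STEPS[dataset]),
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected first.
    for name in MPNN_STEPS:
        print(shlex.join(train_command(name)))
```

To launch training rather than print the command, the returned list can be passed to `subprocess.run(train_command("QM9"), check=True)`.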

Loading & Generation from Pre-trained Models

QM9

python PredX_train.py --data QM9 --loaddir qm9_model/neuralnet_model_best.ckpt --test --mpnn_steps 3

COD

python PredX_train.py --data COD --loaddir cod_model/neuralnet_model_best.ckpt --test --mpnn_steps 5

Running Force-Field Baselines (ETKDG + MMFF/UFF)

QM9/COD/CSD

python baseline.py --data QM9 --num-total-samples 100 --num-parallel-samples 10 --num-threads 10

Running Force-Field Baselines where initial atom coordinates are provided by the neural network (CVGAE + MMFF)

QM9/COD/CSD

python baseline_nn.py --data QM9 --nn-path /path/to/qm9_cvgae_confs

Notes: --nn-path points to the folder containing the conformations generated and saved by CVGAE. To save these conformations, add the --savepermol argument during the loading/generation stage.

Example (QM9):

python PredX_train.py --data QM9 --loaddir qm9_model/neuralnet_model_best.ckpt --savepermol --test --mpnn_steps 3

Instead of saving the CVGAE conformations and loading them separately, you can also run CVGAE + MMFF in a single step by adding the --useFF argument during the loading/generation stage.

Example (QM9):

python PredX_train.py --data QM9 --loaddir qm9_model/neuralnet_model_best.ckpt --useFF --test --mpnn_steps 3
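The two routes to CVGAE + MMFF described above (save conformations and refine them afterwards, or do both in one run) can be laid out side by side. The sketch below only assembles the command lines. Flags and checkpoint paths are copied verbatim from the commands in this section, while the helper names `generation_command`, `two_stage` and `one_stage` are hypothetical. The folder passed as `nn_path` is left to the user, since this README does not state where --savepermol writes its output.

```python
import shlex

# Checkpoint path and message-passing steps per dataset, as used in this README.
PRETRAINED = {
    "QM9": ("qm9_model/neuralnet_model_best.ckpt", 3),
    "COD": ("cod_model/neuralnet_model_best.ckpt", 5),
}


def generation_command(dataset, extra_flag=None):
    """Build the loading/generation command, optionally with one extra flag."""
    ckpt, steps = PRETRAINED[dataset]
    cmd = ["python", "PredX_train.py", "--data", dataset, "--loaddir", ckpt]
    if extra_flag:
        cmd.append(extra_flag)
    cmd += ["--test", "--mpnn_steps", str(steps)]
    return cmd


def two_stage(dataset, nn_path):
    """Stage 1 saves CVGAE conformations; stage 2 refines them with baseline_nn.py."""
    return [
        generation_command(dataset, "--savepermol"),
        ["python", "baseline_nn.py", "--data", dataset, "--nn-path", nn_path],
    ]


def one_stage(dataset):
    """Single run: CVGAE generation followed by MMFF via --useFF."""
    return [generation_command(dataset, "--useFF")]


if __name__ == "__main__":
    for cmd in two_stage("QM9", "/path/to/qm9_cvgae_confs") + one_stage("QM9"):
        print(shlex.join(cmd))
```

The two-stage route keeps the saved conformations on disk so that baseline_nn.py can be rerun without regenerating them. The one-stage route skips that intermediate folder.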

Additional Scripts

QM9_featurize.py and QM9_sdf_to_p.py, COD_featurize.py and COD_sdf_to_p.py, and CSD_featurize.py and CSD_sdf_to_p.py are the preprocessing scripts for the QM9, COD, and CSD datasets, respectively.

Citation

If you find the resources in this repository useful, please consider citing:

@article{Mansimov:19,
  author    = {Elman Mansimov and Omar Mahmood and Seokho Kang and Kyunghyun Cho},
  title     = {Molecular Geometry Prediction using a Deep Generative Graph Neural Network},
  year      = {2019},
  journal   = {arXiv preprint arXiv:1904.00314},
}