Home

Awesome

Update: Our code has been moved into AIRS. Please refer to AIRS for any future updates. This repo is no longer maintained.

Generating 3D Molecules for Target Protein Binding

This is the official implementation of the GraphBP method proposed in the following paper.

Meng Liu, Youzhi Luo, Kanji Uchino, Koji Maruhashi, and Shuiwang Ji. "Generating 3D Molecules for Target Protein Binding". [ICML 2022 Long Presentation]

Requirements

We include key dependencies below. The versions we used are in the parentheses. Our detailed environmental setup is available in environment.yml.

Preparing Data

wget https://bits.csb.pitt.edu/files/crossdock2020/v1.1/CrossDocked2020_v1.1.tgz -P data/crossdock2020/
tar -C data/crossdock2020/ -xzf data/crossdock2020/CrossDocked2020_v1.1.tgz
wget https://bits.csb.pitt.edu/files/it2_tt_0_lowrmsd_mols_train0_fixed.types -P data/crossdock2020/
wget https://bits.csb.pitt.edu/files/it2_tt_0_lowrmsd_mols_test0_fixed.types -P data/crossdock2020/

Note: (1) The unzipping process could take a lot of time. Unzipping on SSD is much faster!!! (2) Several samples in the training set cannot be processed by our code. Hence, we recommend replacing the it2_tt_0_lowrmsd_mols_train0_fixed.types file with a new one, where these samples are deleted. The new one is available here.

python scripts/split_sdf.py data/crossdock2020/it2_tt_0_lowrmsd_mols_train0_fixed.types data/crossdock2020
python scripts/split_sdf.py data/crossdock2020/it2_tt_0_lowrmsd_mols_test0_fixed.types data/crossdock2020

Run

CUDA_VISIBLE_DEVICES=${you_gpu_id} python main.py

Note: GraphBP can be trained on a 48GB GPU with batchsize=16. Our trained model is available here.

CUDA_VISIBLE_DEVICES=${you_gpu_id} python main_gen.py
CUDA_VISIBLE_DEVICES=${you_gpu_id} python main_eval.py

Reference

@inproceedings{liu2022graphbp,
  title={Generating 3D Molecules for Target Protein Binding},
  author={Meng Liu and Youzhi Luo and Kanji Uchino and Koji Maruhashi and Shuiwang Ji},
  booktitle={International Conference on Machine Learning},
  year={2022}
}

Acknowledgments

This work was supported in part by National Science Foundation grants IIS-2006861 and IIS-1908220.