Mothra
Requirements
- Python 3.9 (newer versions of Python cause some errors). NOTICE: init.sh uses pyenv; if you have not installed it, you should install python3.9-venv.
- Keras (version 2.0.5). The newest version of Keras causes errors; if you have it installed, switch back to 2.0.5 with pip install keras==2.0.5.
- (Optional but highly recommended) CUDA (version 11.7) and cuDNN (version 8 for CUDA 11.x)
- tensorflow-gpu (version 1.15.2; versions >= 2.0 cause errors)
- rdkit
- rDock
- AutoDock Vina. Make sure to add Vina to the system path.
- Open Babel. Make sure to add Open Babel to the system path.
- eToxPred (optional). To use toxicity prediction, download https://github.com/pulimeng/eToxPred/raw/master/etoxpred_best_model.tar.gz and untar it into ligand_design/.
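Because Vina and Open Babel must be reachable through the system path, a quick check before a long run can catch a missing tool early. The sketch below is not part of the repository. It assumes the executables are named vina and obabel (adjust these if your installation uses different names) and relies only on the Python standard library.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the names in `tools` that cannot be found on the system PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; change them to match your installation.
    missing = missing_tools(["vina", "obabel"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```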
To install Keras, rdkit, and the other dependencies with pip in a virtual environment, we provide requirements.txt and init.sh in the init directory. After installing Python, you may run bash inits/init.sh.
How to Use
Train the RNN model
- Run python train_RNN/train_RNN.py to train the RNN model. A pretrained model is provided in model/model.h5.
Molecule generation
- Run python ligand_design/mcts_ligand.py data_dir
Although MOMCTS-MolGen has an extendable objective set, the default objectives are docking score, QED score, and logP, plus a filter on SA score.
To define your own objective set, change the simulation functions in add_node_type.py and the reward functions in mcts_ligand.py (these may be merged into one function in future work).
If the size of the objective set is not 3, don't forget to change 'default_reward' in mcts_ligand.py.
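The need to keep 'default_reward' in step with the objective set can be illustrated with a small sketch. This is not the code in mcts_ligand.py: it assumes, purely for illustration, that default_reward is a list with one fallback value per objective, and the function score_molecule, the scorer callables, and the SA threshold are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List, Optional

Scorer = Callable[[str], float]


def score_molecule(
    smiles: str,
    objectives: Dict[str, Scorer],
    default_reward: List[float],
    sa_score: Scorer,
    sa_threshold: float,
) -> Optional[List[float]]:
    """Return one reward per objective, or None if the SA filter rejects the molecule."""
    # The fallback vector needs exactly one entry per objective.
    if len(default_reward) != len(objectives):
        raise ValueError(
            f"default_reward has {len(default_reward)} entries "
            f"but there are {len(objectives)} objectives"
        )
    # A higher SA score means harder to synthesize, so reject above the threshold.
    if sa_score(smiles) > sa_threshold:
        return None
    rewards = []
    for scorer, fallback in zip(objectives.values(), default_reward):
        try:
            rewards.append(scorer(smiles))
        except Exception:
            # Assumed behavior: fall back to the default when a scorer fails.
            rewards.append(fallback)
    return rewards
```

With a check like this, adding a fourth objective without extending the fallback vector fails immediately instead of producing a reward vector of the wrong length.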
Outputs of the ligand_design process are stored in data/present/, including:
output.txt ## changes to the Pareto front
ligands.txt ## ligands that pass the SA score filter
scores.txt ## raw scores of ligands
hverror_output.txt ## hypervolume calculation errors
error_output.txt ## Vina and obabel errors
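To make the Pareto front changes recorded in output.txt easier to interpret, here is a minimal sketch of how a front is typically updated when a new score vector arrives. It is a generic illustration, not the project's implementation: it assumes every objective is to be maximized (a docking score, where lower is better, would need its sign flipped first), and the names dominates and update_front are hypothetical.

```python
from typing import List, Sequence, Tuple

Scores = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_front(front: List[Scores], candidate: Scores) -> List[Scores]:
    """Return the Pareto front after considering `candidate` (maximization assumed)."""
    # A candidate that is dominated by, or identical to, a member changes nothing.
    if any(dominates(member, candidate) or member == candidate for member in front):
        return list(front)
    # Otherwise drop every member the candidate dominates and add the candidate.
    return [member for member in front if not dominates(candidate, member)] + [candidate]
```

For example, starting from an empty front, adding (1, 1, 1) and then (2, 2, 2) leaves only (2, 2, 2), while a later (3, 1, 0) is kept alongside it because neither vector dominates the other.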
Directory structure
.
├─data : for pretrain dataset
├─data_template : template directory for ligand generation
│ ├─input : set target protein(s) for docking on VINA and configure generation
│ ├─output : save generated ligands
│ ├─present : save valid generated ligands and their scores
│ └─workspace : working area for docking each ligand
├─ligand_design : source code for ligand generation
├─model : save an RNN generative model.
└─train_RNN : train an RNN generative model.
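Since data_template is described as a template for ligand generation, one plausible workflow is to copy it to a fresh directory for each run and pass that copy as data_dir. The sketch below assumes this workflow (the README does not state it explicitly); prepare_run_dir is a hypothetical helper, not part of the repository.

```python
import shutil
from pathlib import Path


def prepare_run_dir(template: Path, run_dir: Path) -> Path:
    """Copy the template tree to `run_dir`, refusing to overwrite an existing run."""
    if run_dir.exists():
        raise FileExistsError(f"{run_dir} already exists")
    # copytree creates run_dir and copies input/, output/, present/, workspace/.
    shutil.copytree(template, run_dir)
    return run_dir
```

After copying, the target protein files and generation settings would go into the input subdirectory of the new run directory before it is passed to mcts_ligand.py.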
License
This package is distributed under the GPL License.