Awesome
ReBADD-SE: Multi-objective Molecular Optimisation using SELFIES Fragment and Off-Policy Self-critical Sequence Training
This is the repository for ReBADD-SE, a multi-objective molecular optimization model that designs a molecular structures in the format of SELFIES. For more details, please refer to our paper.
- Latest update: 26 Jan 2024
Install
conda env create -f environment.yml
Task Descriptions
- TASK1: ReBADD-SE for GSK3b, JNK3, QED, and SA (frag-level)
- TASK3: ReBADD-SE for BCL2, BCLXL, and BCLW (frag-level)
- TASK4: ReBADD-SE for BCL2, BCLXL, and BCLW (char-level)
- TASK7: SELFIES Collapse Analaysis between ReBADD-SE (frag-level) and GA+D
Notebook Descriptions
0_preprocess_data.ipynb
- (Important!) Before starting any TASK, please first run the scripts in the directory 'data/chembl' or 'data/zinc15'
- Read the training data
- Preprocess the data for model training
1_pretraining.ipynb
- Read the training data
- The generator learns the grammar rules of SELFIES
2_optimize+{objectives}.ipynb
- (Important!) Please check first the 'ReBADD_config.py' in which a reward function have to be defined appropriately
- Load the pretrained generator
3_checkpoints+{objectives}.ipynb
- Load the checkpoints stored during optimization
- Sample molecules for each checkpoint
4_calculate_properties.ipynb
- For each checkpoint, load the sampled molecules
- Evaluate their property scores
5_evaluate_checkpoints.ipynb
- Calculate metrics (e.g. success rate)
- Find the best checkpoint
Note
If you have any further questions, please do not hesitate to let me know.
jonghwanc@hallym.ac.kr
Citation
@article{CHOI2023106721,
title = {ReBADD-SE: Multi-objective molecular optimisation using SELFIES fragment and off-policy self-critical sequence training},
journal = {Computers in Biology and Medicine},
volume = {157},
pages = {106721},
year = {2023},
issn = {0010-4825},
doi = {https://doi.org/10.1016/j.compbiomed.2023.106721},
url = {https://www.sciencedirect.com/science/article/pii/S0010482523001865},
author = {Jonghwan Choi and Sangmin Seo and Seungyeon Choi and Shengmin Piao and Chihyun Park and Sung Jin Ryu and Byung Ju Kim and Sanghyun Park},
keywords = {Drug discovery, De novo drug design, Multi-objective optimisation, SELFIES, Reinforcement learning}
}