Awesome
MolRL-MGPT
This is the code repository for our paper published in NeurIPS 2023: De novo Drug Design using Reinforcement Learning with Multiple GPT Agents.
Dependencies
pytorch==1.12.1
rdkit==2020.03
tqdm
tensorboard
multiprocessing
PyTDC
openbabel
Dataset & Docking
Following Chemformer, we use a filtered ZINC dataset containing 100M SMILES. The files are available at MolecularAI/Chemformer.
The ChEMBL dataset is available at ChEMBL.
The SMILES vocabulary and protein structures can be found in data/
.
Quick Vina 2 is available at QuickVina.
Pre-training
python codes/pretrain.py
Multi-agent Reinforcement Learning
GuacaMol benchmark
python codes/MARL.py --task_id 0
SARS-COV-2 protein targets
python codes/MARL.py --oracle docking_PLPro_7JIR_mpo
python codes/MARL.py --oracle docking_RdRp_mpo
Citation
@article{hu2024novo,
title={De novo Drug Design using Reinforcement Learning with Multiple GPT Agents},
author={Hu, Xiuyuan and Liu, Guoqing and Zhao, Yang and Zhang, Hao},
journal={Advances in Neural Information Processing Systems},
volume={36},
year={2024}
}