HYDRA

This repository is the official implementation of the paper "A Hybrid Diffusion Model for Stable, Affinity-Driven, Receptor-Aware Peptide Generation".

<p align="center"> <img src="assets/flow.png"> </p>

Requirements

HYDRA was developed using PyTorch 1.13.1 on Python 3.8.18, which remain the preferred versions for running the code. A virtual environment manager such as Conda is recommended.

Install dependencies:

conda env create -f environment.yml

Activate the environment:

conda activate HYDRA
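
Once the environment is active, a quick check can confirm that the interpreter and PyTorch versions match those listed above. The following is a minimal sketch and not part of the repository: the function name `check_environment` is hypothetical, and it assumes PyTorch is registered under the distribution name `torch`.

```python
import sys
from importlib import metadata

EXPECTED_PYTHON = (3, 8)   # Python 3.8.x, as listed under Requirements
EXPECTED_TORCH = "1.13.1"  # PyTorch version listed under Requirements


def check_environment():
    """Return a list of human-readable version mismatches (empty if none)."""
    problems = []

    found_python = sys.version_info[:2]
    if found_python != EXPECTED_PYTHON:
        problems.append("Python %d.%d found, 3.8 expected" % found_python)

    try:
        # Assumption: the installed package is registered as "torch".
        found_torch = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed in this environment")
    else:
        # Builds may carry a local suffix such as "+cu117"; compare the base version only.
        if found_torch.split("+")[0] != EXPECTED_TORCH:
            problems.append(
                "PyTorch %s found, %s expected" % (found_torch, EXPECTED_TORCH)
            )

    return problems
```

An empty list means both versions match. Other versions may still work, but they are not the ones the code was developed with.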

Data

The model was trained on the 2020-03-18 release of PepBDB, which can be obtained by running:

wget http://huanglab.phys.hust.edu.cn/pepbdb/db/download/pepbdb-20200318.tgz
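
The downloaded file is a gzip-compressed tar archive and has to be unpacked before the cleaning scripts can read it. A minimal sketch of that step using only the Python standard library is shown below. The helper name `extract_archive` is hypothetical, and the internal layout of the archive is not assumed here, so check the extracted directory before passing it as `--source`.

```python
import tarfile
from pathlib import Path


def extract_archive(archive, dest):
    """Extract a .tgz archive into dest and return the names of its top-level entries."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" opens a gzip-compressed tar file, which is what the .tgz suffix denotes.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


# Example call (paths are placeholders):
# extract_archive("pepbdb-20200318.tgz", "/path/to/pepbdb")
```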

Scripts to clean and preprocess the dataset for use with HYDRA are provided in the utils/datasets/ directory and can be used as follows:

python3 utils/datasets/clean_pepbdb.py --source /path/to/pepbdb --dest /path/to/pepbdb --n_atom_thr 200
python3 utils/datasets/process_pepbdb.py --source /path/to/pepbdb --dest /path/to/pepbdb_natoms200_pocket10 --radius 10

Pre-trained Model Checkpoint

You can download the model weights used in the paper here for inference.

Configuration

HYDRA is configured through multiple .yml files located in the configs/ directory.

You may need to edit these files so that they point to the dataset and checkpoint paths on your system before training or evaluating.
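
To spot paths that still need editing, a small helper can scan a configuration file for values that look like filesystem paths but do not exist locally. This is a rough sketch, not part of the repository: `missing_paths` is a hypothetical name, the check is a line-based heuristic rather than a real YAML parser, and it makes no assumption about which keys the HYDRA configs actually contain.

```python
import re
from pathlib import Path

# Matches simple "key: value" lines; nested YAML structure is deliberately ignored.
_KEY_VALUE = re.compile(r"^\s*([\w.-]+)\s*:\s*(\S.*?)\s*$")


def missing_paths(config_file):
    """Return (key, value) pairs whose value looks like a path that does not exist."""
    missing = []
    for line in Path(config_file).read_text().splitlines():
        line = line.split(" #", 1)[0]  # drop trailing comments
        match = _KEY_VALUE.match(line)
        if not match:
            continue
        key = match.group(1)
        value = match.group(2).strip("'\"")
        # Heuristic: any value containing "/" is treated as a path.
        if "/" in value and not Path(value).expanduser().exists():
            missing.append((key, value))
    return missing


# Example call: missing_paths("configs/train.yml")
```

Because the check is only a heuristic, values such as URLs will also be reported. Treat the output as a list of candidates to review.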

Training

To train HYDRA, run:

python3 scripts/train.py configs/train.yml

Optionally, pass --num_gpus N to train on multiple GPUs using the Distributed Data Parallel strategy.
Pass --ckpt to resume training from an existing checkpoint.
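
When training is launched from another script, for example a job scheduler wrapper, the command line can be assembled programmatically. The sketch below is illustrative only: `build_train_command` is a hypothetical helper, and the checkpoint path in the example is a placeholder. Only the script path and the `--num_gpus` and `--ckpt` flags come from this README.

```python
def build_train_command(config="configs/train.yml", num_gpus=None, ckpt=None):
    """Build the argument list for scripts/train.py with optional flags."""
    cmd = ["python3", "scripts/train.py", config]
    if num_gpus is not None:
        # Multi-GPU training (Distributed Data Parallel), as described above.
        cmd += ["--num_gpus", str(num_gpus)]
    if ckpt is not None:
        # Resume from an existing checkpoint.
        cmd += ["--ckpt", str(ckpt)]
    return cmd


# Example (the checkpoint path is a placeholder):
# import subprocess
# subprocess.run(build_train_command(num_gpus=4, ckpt="/path/to/last.ckpt"), check=True)
```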

Evaluation

Evaluating HYDRA must be done in two stages:

  1. Sampling residues based on the target receptor.
  2. Reconstructing generated residues into peptides.

1.1 Sampling for all receptors in the test set

python3 scripts/sample-testset.py configs/train.yml configs/sample.yml --out_dir ./outputs

1.2 Sampling for a receptor from PDB

python3 scripts/sample-pdb.py data/pfemp1/PF3D71150400_MEDIUM.pdb configs/train.yml configs/sample.yml --out_dir ./outputs

2. Reconstructing sampled residues into peptides

python3 scripts/reconstruct.py ./outputs configs/reconstruct.yml
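
The two stages share the output directory: reconstruction reads whatever sampling wrote to `--out_dir`. A minimal sketch that chains both stages is given below. The helper names `evaluation_commands` and `run_evaluation` are hypothetical; the script paths, config files, and argument order are copied from the commands above.

```python
import subprocess


def evaluation_commands(out_dir="./outputs", receptor_pdb=None):
    """Return the sampling and reconstruction commands, in execution order."""
    if receptor_pdb is None:
        # Stage 1.1: sample for every receptor in the test set.
        sample = ["python3", "scripts/sample-testset.py"]
    else:
        # Stage 1.2: sample for a single receptor given as a PDB file.
        sample = ["python3", "scripts/sample-pdb.py", str(receptor_pdb)]
    sample += ["configs/train.yml", "configs/sample.yml", "--out_dir", out_dir]

    # Stage 2: reconstruct peptides from the residues written to out_dir.
    reconstruct = ["python3", "scripts/reconstruct.py", out_dir, "configs/reconstruct.yml"]
    return [sample, reconstruct]


def run_evaluation(out_dir="./outputs", receptor_pdb=None):
    """Run both stages; stop at the first command that fails."""
    for cmd in evaluation_commands(out_dir, receptor_pdb):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on a non-zero exit
```

Calling `run_evaluation()` with no arguments mirrors steps 1.1 and 2. Passing `receptor_pdb` switches the first stage to step 1.2.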

Contributing

All source code within this repository is licensed under the MIT License. See the LICENSE file for details.