


This repository is the official implementation of the paper "A Hybrid Diffusion Model for Stable, Affinity-Driven, Receptor-Aware Peptide Generation".

<p align="center"> <img src="assets/flow.png"> </p>


HYDRA was developed using PyTorch 1.13.1 on Python 3.8.18, which remain the preferred versions for reproducing the results. A virtual environment manager such as Conda is recommended.

Install dependencies:

conda env create -f environment.yml

Activate the environment:

conda activate HYDRA


The model was trained on release-2020-03-18 of PepBDB.
It can be obtained by running:

wget http://huanglab.phys.hust.edu.cn/pepbdb/db/download/pepbdb-20200318.tgz

Scripts to clean and preprocess the dataset for use with HYDRA are provided in the utils/datasets/ directory and can be used as follows:

python3 utils/datasets/clean_pepbdb.py --source /path/to/pepbdb --dest /path/to/pepbdb --n_atom_thr 200
python3 utils/datasets/process_pepbdb.py --source /path/to/pepbdb --dest /path/to/pepbdb_natoms200_pocket10 --radius 10
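
The --radius 10 option and the pocket10 suffix suggest that processing keeps only the part of the receptor near the peptide. The logic of process_pepbdb.py is not reproduced here, so the following is only an illustrative sketch. It assumes the radius is measured in ångströms and that a pocket consists of the receptor atoms lying within that distance of any peptide atom. The function name `select_pocket_atoms` and the toy coordinates are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def select_pocket_atoms(
    receptor: Sequence[Coord],
    peptide: Sequence[Coord],
    radius: float = 10.0,
) -> List[int]:
    """Return indices of receptor atoms within `radius` of any peptide atom.

    Assumption: this mirrors what a "pocket radius" usually means; it is not
    taken from the repository's preprocessing code.
    """
    keep = []
    for i, r_atom in enumerate(receptor):
        # Keep the atom as soon as one peptide atom is close enough.
        if any(math.dist(r_atom, p_atom) <= radius for p_atom in peptide):
            keep.append(i)
    return keep


# Toy example: two receptor atoms near the peptide, one far away.
receptor = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
peptide = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
pocket_indices = select_pocket_atoms(receptor, peptide, radius=10.0)
print(pocket_indices)  # [0, 1]
```

A brute-force loop like this is quadratic in the number of atoms. For full structures, a spatial index such as a k-d tree would be the more usual choice.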

Pre-trained Model Checkpoint

You can download the model weights used in the paper here for inference.


HYDRA is configured through several .yml files located in the configs/ directory.

Before training or evaluation, you may need to edit these files so that they point to the dataset and checkpoint paths on your system.


To train HYDRA, run:

python3 scripts/train.py configs/train.yml

Optionally, pass --num_gpus N to train on multiple GPUs using the Distributed Data Parallel strategy.
Pass --ckpt to resume training from an existing checkpoint.
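
If you launch training from another Python script (for example, a job scheduler wrapper), the command and its optional flags can be assembled programmatically. The sketch below uses only the flags mentioned above. It assumes that --ckpt takes a checkpoint path as its value; the helper name `build_train_command` and the checkpoint path are hypothetical.

```python
from typing import List, Optional


def build_train_command(
    config: str = "configs/train.yml",
    num_gpus: Optional[int] = None,
    ckpt: Optional[str] = None,
) -> List[str]:
    """Assemble the training command as an argument list."""
    cmd = ["python3", "scripts/train.py", config]
    if num_gpus is not None:
        # Multi-GPU training via the --num_gpus flag.
        cmd += ["--num_gpus", str(num_gpus)]
    if ckpt is not None:
        # Assumption: --ckpt expects the path of the checkpoint to resume from.
        cmd += ["--ckpt", ckpt]
    return cmd


# Hypothetical checkpoint path, for illustration only.
train_cmd = build_train_command(num_gpus=4, ckpt="/path/to/checkpoint.ckpt")
print(" ".join(train_cmd))
```

The resulting list can be passed to `subprocess.run(train_cmd, check=True)` from the repository root.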


HYDRA is evaluated in two stages:

  1. Sampling residues based on the target receptor.
  2. Reconstructing generated residues into peptides.

1.1 Sampling for all receptors in the test set

python3 scripts/sample-testset.py configs/train.yml configs/sample.yml --out_dir ./outputs

1.2 Sampling for a receptor from PDB

python3 scripts/sample-pdb.py data/pfemp1/PF3D71150400_MEDIUM.pdb configs/train.yml configs/sample.yml --out_dir ./outputs

2. Reconstructing sampled residues into peptides

python3 scripts/reconstruct.py ./outputs configs/reconstruct.yml
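
The two stages above share the output directory, so it can be convenient to chain them from a single driver. The following is a minimal sketch that reuses the exact commands shown above. The helper names `evaluation_commands` and `run_evaluation`, and the `dry_run` switch, are hypothetical and not part of the repository.

```python
import subprocess
from typing import List, Optional


def evaluation_commands(
    out_dir: str = "./outputs",
    receptor_pdb: Optional[str] = None,
) -> List[List[str]]:
    """Build the sampling and reconstruction commands for one evaluation run."""
    if receptor_pdb is None:
        # Stage 1.1: sample for every receptor in the test set.
        sample = ["python3", "scripts/sample-testset.py"]
    else:
        # Stage 1.2: sample for a single receptor given as a PDB file.
        sample = ["python3", "scripts/sample-pdb.py", receptor_pdb]
    sample += ["configs/train.yml", "configs/sample.yml", "--out_dir", out_dir]

    # Stage 2: reconstruct peptides from the residues written to out_dir.
    reconstruct = ["python3", "scripts/reconstruct.py", out_dir, "configs/reconstruct.yml"]
    return [sample, reconstruct]


def run_evaluation(
    out_dir: str = "./outputs",
    receptor_pdb: Optional[str] = None,
    dry_run: bool = True,
) -> List[List[str]]:
    """Print both stage commands and, unless dry_run is set, execute them in order."""
    commands = evaluation_commands(out_dir, receptor_pdb)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True aborts before reconstruction if sampling fails.
            subprocess.run(cmd, check=True)
    return commands


# Dry run: only prints the commands, nothing is executed.
commands = run_evaluation(dry_run=True)
```

Set `dry_run=False` and run from the repository root to execute both stages.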


All source code within this repository is licensed under the MIT License. Please see the LICENSE file for more details.