Antibody-SGM

Antibody-SGM schematic

Description

Antibody-SGM is a score-based generative model for de novo antibody heavy chain design. This repository contains the codebase for [Antibody-SGM, A score-based generative model for de novo antibody heavy chain design]

Installation

Run the following commands to install AB-SGM and the necessary dependencies; installation should take less than 15 minutes. We recommend using the conda environment supplied in this repository.

  1. Clone the repository with `git clone https://github.com/xxiexuezhi/ABSGM.git`, or download the zip file and extract it.
  2. Install the conda environment: `conda env create -f ab_env.yaml`
  3. Activate the conda environment: `conda activate ab_env`

To run AB-SGM, download and extract the model parameters (Saved weights).

The code is tested on Python 3.8.17 and requires the GPU version of PyTorch for training and sampling. We use a GPU for model training and 6D coordinate sampling, and CPU batch processing for Rosetta. More specifically, we used a single NVIDIA V100/A100 for training and inference, and at least 2 CPU cores and 8 GB of RAM per Rosetta job. Please refer to the shell scripts in the GitHub repository for detailed configurations.
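Before training or sampling, it can help to confirm that the active environment matches the requirements above. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper name, and it only reports the Python version and whether PyTorch can see a CUDA device.

```python
import importlib.util
import sys


def check_environment():
    """Collect basic facts about the current Python/PyTorch setup.

    Hypothetical helper for illustration; it is not shipped with AB-SGM.
    """
    report = {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        try:
            # Imported lazily so the check also runs where PyTorch is missing.
            import torch

            report["torch_version"] = torch.__version__
            report["cuda_available"] = bool(torch.cuda.is_available())
        except ImportError:
            # A broken installation is reported as "not installed".
            report["torch_installed"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` inside the `ab_env` environment, the CPU-only build of PyTorch may have been installed, or no GPU is visible to the process.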

Inference (Conditional generation)

All related code is in the CDR_inpainting_conditional_generations directory. Structures are generated by sampling 6D coordinates from the model and then running PyRosetta. The encoded examples (PDB IDs: 6nmv, 6hga, 1i9r) are uploaded. They can be found from:

To encode antigen-antibody complex structures, please refer to the README inside CDR_inpainting_conditional_generations/encoding/.

6D coordinate and sequence pair generation

Use `python sampling_6d.py` to generate 6D coordinates and sequences. We ran generation through shell scripts (please refer to h3_6d_sample.sh for more details). For instance,

python sampling_6d.py ./configs/inpainting_ch6.yml ../saved_weights/h3_inpaint.pth --pkl proteindataset_example_singlecdr_inpaint_h3_6nmv_6hga_1i9r.pkl --chain A --index 1  --tag singlecdr_inpaint_h3_Jun_2024_fixed_padding


The descriptions of each parameter are as below:

6D coordinate sampling should take about 1 minute per sample on a normal GPU. Rosetta minimization should take about 3 hours per iteration, depending on the size of the H1, H2, or H3 region selected for design.
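When several sampling runs are needed, the command shown above can be assembled programmatically. The sketch below is an illustration only: `build_sampling_command` is a hypothetical helper, the default arguments are copied from the example command, and the extra index values in the loop are placeholders, since this section does not describe what `--index` selects.

```python
import shlex


def build_sampling_command(
    index,
    config="./configs/inpainting_ch6.yml",
    weights="../saved_weights/h3_inpaint.pth",
    pkl="proteindataset_example_singlecdr_inpaint_h3_6nmv_6hga_1i9r.pkl",
    chain="A",
    tag="singlecdr_inpaint_h3_Jun_2024_fixed_padding",
):
    """Return the argument list for one sampling_6d.py run.

    Hypothetical helper; only the flags that appear in the example
    command (--pkl, --chain, --index, --tag) are used.
    """
    return [
        "python", "sampling_6d.py", config, weights,
        "--pkl", pkl,
        "--chain", chain,
        "--index", str(index),
        "--tag", tag,
    ]


if __name__ == "__main__":
    # Placeholder indices: adjust to the entries you actually want to sample.
    for index in (1, 2, 3):
        # Print the command instead of running it, so it can be reviewed
        # or pasted into a job script.
        print(shlex.join(build_sampling_command(index)))
```

To execute a command rather than print it, the same list can be passed to `subprocess.run(cmd, check=True)` from inside the CDR_inpainting_conditional_generations directory.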

6D coordinate and sequence pairs to PDBs

Use `python convert_6d_seq_to_pdb.py` to convert the 6D coordinate and sequence pairs into PDB files using PyRosetta. Please refer to h3_shell_job_6d_to_pdb.sh for more details. For instance,



python convert_6d_seq_to_pdb.py singlecdr_inpaint_h3_Jun_2024_fixed_padding/samples_${SLURM_ARRAY_TASK_ID}.pkl single_hv_example_pdb/${files[${SLURM_ARRAY_TASK_ID}]}  ${SLURM_ARRAY_TASK_ID}  3
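The command above relies on the `SLURM_ARRAY_TASK_ID` variable that SLURM sets for each array task, and on a shell array `files` defined in the job script. As a rough sketch of how the arguments fit together, the hypothetical helper below rebuilds the same command in Python. The PDB file names are made up for illustration, and the final argument (`3` in the example) is passed through unchanged because its meaning is not documented in this section.

```python
import os
import shlex


def build_conversion_command(
    task_id,
    pdb_files,
    sample_dir="singlecdr_inpaint_h3_Jun_2024_fixed_padding",
    pdb_dir="single_hv_example_pdb",
    last_arg="3",
):
    """Return the argument list for one convert_6d_seq_to_pdb.py run.

    Hypothetical helper mirroring the example command: the task ID picks
    both the samples pickle and the matching entry of pdb_files.
    """
    samples_path = f"{sample_dir}/samples_{task_id}.pkl"
    pdb_path = f"{pdb_dir}/{pdb_files[task_id]}"
    return [
        "python", "convert_6d_seq_to_pdb.py",
        samples_path, pdb_path, str(task_id), str(last_arg),
    ]


if __name__ == "__main__":
    # SLURM sets SLURM_ARRAY_TASK_ID inside array jobs; default to 0 elsewhere.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    # Made-up file names standing in for the `files` array of the job script.
    files = ["example_0.pdb", "example_1.pdb"]
    print(shlex.join(build_conversion_command(task_id, files)))
```

Given the roughly 3 hours per Rosetta iteration noted earlier, running one such command per array task lets the conversions proceed in parallel on CPU nodes.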




Dataset


The raw antigen-antibody complex dataset for CDR conditional generation can be downloaded here

The sabdab_downloader.py script is also available for download.

Reference


Antibody-SGM (to be updated)