Awesome
ESMIFDesign
This repository focuses on designing T-cell receptors (TCRs) using the ESM-IF1 deep learning method.
The ESM-IF1 inverse folding method is built for predicting protein sequences from their backbone atom coordinates. Here, we use the ESM-IF1 model (esm_if1_gvp4_t16_142M_UR50 - fair-esm v2.0.1) to design part of the TCRs sequences, considering their peptide-major histocompatibility complex (pMHC) complex. Positions outside the specified criteria were held constant. The designed positions were changed to <mask> tokens, and all amino acid substitutions, including cysteine, were allowed. Given the multi-chain structure, a padding of 10 <pad> tokens was used to separate the chains.
Dependencies
To install the dependencies, run:
pip install -r requirements.txt
Usage
The configuration file is located at config.json
, where it specifies, for each structure, the residues and chains to be designed. The file is organized as follows:
{
"6zkw": ["110D","111D","112D","134D","135D","113E","114E","133E"],
...,
"8shi": ["109D","110D","111D","112D","113D","114D","135D","110E","111E","112E","113E","134E","135E"]
}
To design TCR sequences, run:
python run.py
Testing
We tested some conditions to check the performance of the model.
- Testing on
6ZKW
structure:6ZKW.pdb
represents a TCR-pMHC complex.6ZKW_DE.pdb
includes only the TCR (D and E chains).6ZKW_contact_to_gly.pdb
is a TCR-pMHC complex with mutations on the TCR residues in contact with the pMHC, changing them to glycine.6ZKW_all_gly.pdb
is a TCR-pMHC complex with mutations on all residues, changing them to glycine.
- Testing on pMHC interference on protein sequence design (
tests/pMHC1.config
) - Testing temperature in the interval (
tests/temperature.config
): [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 2, 5] - Testing number of samples in the interval (
tests/sampling.config
): [5, 10, 25, 50, 100, 250, 500] - Testing design approaches:
CDR3 interface
(tests/CDR3_interface.config
): restricting the design to CDR3 (α and β TCR chains) within a proximity of 5 Å to either the peptide or MHC;CDR3
(tests/CDR3.config
): designing the entire CDR3 (α and β TCR chains).CDRs interface
(tests/CDRs_interface.config
): designing CDR1, CDR2, or CDR3 positions (α and β TCR chains) within a 5 Å distance to the peptide or MHC;
To run the tests, navigate to the tests directory and execute:
cd tests
python testing.py