

Proximal Exploration (PEX)

This repository contains a PyTorch implementation of our paper "Proximal Exploration for Model-guided Protein Sequence Design", published at ICML 2022. Proximal Exploration (PEX) is a variant of directed evolution that prioritizes the search for low-order mutants. Based on this local-search mechanism, a model architecture called Mutation Factorization Network (MuFacNet) is developed to specialize in the local fitness landscape around the wild type.
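
To make "low-order mutants" concrete, the sketch below counts how many positions of a candidate differ from the wild type and ranks candidates by predicted fitness minus a penalty on that count. It is only an illustration of the idea, not the selection rule implemented in this repository: the helper names (`mutation_order`, `rank_proximal`), the `penalty` weight, the toy sequences, and the fitness values are all hypothetical, and only equal-length substitutions are handled.

```python
from typing import List, Tuple


def mutation_order(wild_type: str, candidate: str) -> int:
    """Count substituted positions relative to the wild type (equal lengths only)."""
    if len(wild_type) != len(candidate):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(wild_type, candidate))


def rank_proximal(
    wild_type: str,
    scored: List[Tuple[str, float]],
    penalty: float = 0.1,
) -> List[Tuple[str, float]]:
    """Sort (sequence, predicted_fitness) pairs, best first.

    The sort key is fitness minus `penalty` times the mutation order, so a
    candidate far from the wild type needs a larger predicted gain to rank high.
    This penalized score is an assumption made for illustration.
    """
    return sorted(
        scored,
        key=lambda item: item[1] - penalty * mutation_order(wild_type, item[0]),
        reverse=True,
    )


# Toy data: a made-up wild type and made-up model predictions.
wild_type = "MKTAYIAK"
candidates = [
    ("MKTAYIAR", 1.00),  # single mutant
    ("MRTAYLAR", 1.10),  # triple mutant with a slightly higher raw score
    ("MKTAYIAK", 0.70),  # the wild type itself
]

ranked = rank_proximal(wild_type, candidates, penalty=0.2)
for sequence, fitness in ranked:
    print(sequence, fitness, mutation_order(wild_type, sequence))
```

With this penalty, the single mutant ranks above the triple mutant even though the latter has the higher raw prediction, which is the kind of preference a proximal search expresses.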


The dependencies can be set up using the following commands:

conda create -n pex python=3.8 -y
conda activate pex
conda install pytorch=1.10.2 cudatoolkit=11.3 -c pytorch -y
conda install numpy=1.19 pandas=1.3 -y
conda install -c conda-forge tape_proteins=0.5 -y
pip install sequence-models==1.2.0

Clone this repository and download the oracle landscape models with the following commands:

git clone https://github.com/HeliXonProtein/proximal-exploration.git
cd proximal-exploration
bash download_landscape.sh


Run the following commands to reproduce the main results shown in Section 5.1 of the paper. Eight fitness landscapes are provided to support a diverse evaluation of black-box protein sequence design.

python run.py --alg=pex --net=mufacnet --task=avGFP  # Green Fluorescent Proteins
python run.py --alg=pex --net=mufacnet --task=AAV    # Adeno-associated Viruses
python run.py --alg=pex --net=mufacnet --task=TEM    # TEM-1 β-Lactamase
python run.py --alg=pex --net=mufacnet --task=E4B    # Ubiquitination Factor Ube4b
python run.py --alg=pex --net=mufacnet --task=AMIE   # Aliphatic Amide Hydrolase
python run.py --alg=pex --net=mufacnet --task=LGK    # Levoglucosan Kinase
python run.py --alg=pex --net=mufacnet --task=Pab1   # Poly(A)-binding Protein
python run.py --alg=pex --net=mufacnet --task=UBE2I  # SUMO E2 conjugase

In the default configuration, the protein fitness landscape is simulated by a TAPE-based oracle model. Adding the argument --oracle_model=esm1b switches the landscape simulator to an oracle model based on ESM-1b.
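
For running all eight landscapes in one go, a small driver script can assemble the command lines listed above. The sketch below only builds and prints the commands; it uses the flags documented in this README (`--alg`, `--net`, `--task`, `--oracle_model`), while the helper name `build_command` and the `TASKS` list are introduced here for illustration and are not part of the repository.

```python
import shlex
from typing import List, Optional

# Task names as they appear in the commands above.
TASKS = ["avGFP", "AAV", "TEM", "E4B", "AMIE", "LGK", "Pab1", "UBE2I"]


def build_command(task: str, oracle_model: Optional[str] = None) -> List[str]:
    """Return the argument list for one PEX + MuFacNet run on `task`."""
    cmd = ["python", "run.py", "--alg=pex", "--net=mufacnet", f"--task={task}"]
    if oracle_model is not None:
        # Omitting this flag keeps the default TAPE-based oracle.
        cmd.append(f"--oracle_model={oracle_model}")
    return cmd


# Build one command per landscape, here with the ESM-1b oracle.
commands = [build_command(task, oracle_model="esm1b") for task in TASKS]

for cmd in commands:
    print(shlex.join(cmd))
```

To launch the runs rather than print them, each list could be passed to `subprocess.run(cmd, check=True)` from the repository root, assuming the environment set up above is active.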


Please contact zhizhour[at]helixon.com for any questions related to the source code.