Home

Awesome

Resources:

Source codes:

Step-by-step running:

0. Install Python libraries needed

conda create -n geometric python=3
conda activate geometric
conda install -y -c conda-forge rdkit
conda install pytorch torchvision cudatoolkit -c pytorch
pip install torch-scatter==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-sparse==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-cluster==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-spline-conv==latest+cu101 -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-geometric

1. Create data in pytorch format

Running

conda activate geometric
python create_data.py

This returns kiba_train.csv, kiba_test.csv, davis_train.csv, and davis_test.csv, saved in data/ folder. These files are in turn input to create data in pytorch format, stored at data/processed/, consisting of kiba_train.pt, kiba_test.pt, davis_train.pt, and davis_test.pt.

2. Train a prediction model

To train a model using training data. The model is chosen if it gains the best MSE for testing data.
Running

conda activate geometric
python training.py 0 0 0

where the first argument is for the index of the datasets, 0/1 for 'davis' or 'kiba', respectively; the second argument is for the index of the models, 0/1/2/3 for GINConvNet, GATNet, GAT_GCN, or GCNNet, respectively; and the third argument is for the index of the cuda, 0/1 for 'cuda:0' or 'cuda:1', respectively. Note that your actual CUDA name may vary from these, so please change the following code accordingly:

cuda_name = "cuda:0"
if len(sys.argv)>3:
    cuda_name = "cuda:" + str(int(sys.argv[3])) 

This returns the model and result files for the modelling achieving the best MSE for testing data throughout the training. For example, it returns two files model_GATNet_davis.model and result_GATNet_davis.csv when running GATNet on Davis data.

3. Train a prediction model with validation

In "3. Train a prediction model", a model is trained on training data and chosen when it gains the best MSE for testing data. This follows how a model was chosen in https://github.com/hkmztrk/DeepDTA. The result by two ways of training is comparable though.

In this section, a model is trained on 80% of training data and chosen if it gains the best MSE for validation data, which is 20% of training data. Then the model is used to predict affinity for testing data.

Same arguments as in "3. Train a prediction model" are used. E.g., running

python training_validation.py 0 0 0

This returns the model achieving the best MSE for validation data throughout the training and performance results of the model on testing data. For example, it returns two files model_GATNet_davis.model and result_GATNet_davis.csv when running GATNet on Davis data.