
<p align="center"><img src="figs/logo_deeppurpose_horizontal.png" alt="logo" width="400px" /></p> <h3 align="center"> <p> A Deep Learning Library for Compound and Protein Modeling <br>DTI, Drug Property, PPI, DDI, Protein Function Prediction<br></h3> <h4 align="center"> <p> Applications in Drug Repurposing, Virtual Screening, QSAR, Side Effect Prediction and More </h4>

This repository hosts DeepPurpose, a deep learning based molecular modeling and prediction toolkit (built on PyTorch) for drug-target interaction prediction, compound property prediction, protein-protein interaction prediction, and protein function prediction. We focus on DTI and its applications in drug repurposing and virtual screening, but also support various other molecular encoding tasks. It is designed to be easy to use (only a few lines of code) to facilitate deep learning for life science research.

News!

[05/21] 0.1.2 Support 5 new graph neural network based models for compound encoding (DGL_GCN, DGL_NeuralFP, DGL_GIN_AttrMasking, DGL_GIN_ContextPred, DGL_AttentiveFP), implemented using DGL Life Science! An example is provided here!
[12/20] DeepPurpose is now supported by the TDC data loader, which contains a large collection of ML for therapeutics datasets, including many drug property and DTI datasets. Here is a tutorial!
[12/20] DeepPurpose can now be installed via pip!
[11/20] DeepPurpose is published in Bioinformatics!
[11/20] Added 5 more pretrained models on BindingDB IC50 units (around 1 million data points).
[10/20] Google Colab installation instructions are provided here. Thanks to @hima111997!
[10/20] Using DeepPurpose, we made a humans-in-the-loop molecular design web UI. Check it out! [Website, paper]
[09/20] DeepPurpose now supports three more tasks: DDI, PPI and Protein Function Prediction! You can simply call from DeepPurpose import DDI/PPI/ProteinPred to use them. Check out the examples below!
[07/20] A simple web UI for DTI prediction can be created in under 10 lines using Gradio! A demo is provided here.
[07/20] A blog post is up on the Towards Data Science Medium column. Check it out!
[07/20] Two tutorials are online to go through DeepPurpose's framework to do drug-target interaction prediction and drug property prediction (DTI, Drug Property).
[05/20] Supports drug property prediction for screening data that does not have target proteins, such as bacteria! An example using RDKit2D with a DNN for training and repurposing for Pseudomonas aeruginosa (MIT AI Cures' open task) is provided as a demo.
[05/20] Now supports hyperparameter tuning via Bayesian Optimization through the Ax platform! A demo is provided here.

Features

15+ powerful encodings for drugs and proteins, ranging from deep neural networks on classic cheminformatics fingerprints, CNNs, and transformers to message passing graph neural networks, with 50+ combined models! Most of these encoding combinations have not yet appeared in existing work. All of this takes under 10 lines of code, with lots of flexibility! Switching encodings is as simple as changing the encoding names!
Realistic and user-friendly design:
- supports DTI, DDI, PPI, molecular property prediction, and protein function prediction!
- automatically identifies whether the task is drug-target binding affinity prediction (regression) or drug-target interaction prediction (binary classification).
- supports cold target and cold drug settings for robust model evaluation, and supports a single-target high throughput sequencing assay data setup.
- many dataset loading/downloading/unzipping scripts to ease the tedious preprocessing, including antiviral, COVID19 targets, BindingDB, DAVIS, KIBA, ...
- many pretrained checkpoints.
- easy monitoring of the training process with detailed training metrics output such as test set figures (AUCs) and tables; also supports early stopping.
- detailed output records such as a ranked list for repurposing results.
- various evaluation metrics: ROC-AUC, PR-AUC, and F1 for binary tasks; MSE, R-squared, and Concordance Index for regression tasks.
- label unit conversion for skewed label distributions such as Kd.
- time reference for computationally expensive encodings.
- PyTorch based; supports CPU, GPU, and multi-GPU.
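
Two of the features above, label unit conversion and cold splits, are easier to picture with a small example. The following is a minimal standalone sketch in plain Python, not DeepPurpose's own implementation: the function names `nm_to_p` and `cold_split`, the floor value used to avoid log(0), and the toy records are all assumptions made for illustration.

```python
import math
import random


def nm_to_p(value_nm, floor_nm=1e-10):
    """Convert an affinity in nM (e.g. Kd) to the -log10(molar) scale.

    Raw Kd values span many orders of magnitude, so the log scale gives a
    much less skewed regression label. floor_nm is an assumption of this
    sketch, used only to avoid taking log(0).
    """
    value_nm = max(value_nm, floor_nm)
    return -math.log10(value_nm * 1e-9)  # nM -> M, then negative log10


def cold_split(records, key, frac=(0.7, 0.1, 0.2), seed=0):
    """Split records so that no value of record[key] is shared across splits.

    With key="drug" this is a cold drug split: every drug in the test set
    is unseen during training. Entities left over after the train and
    validation fractions go to the test set.
    """
    entities = sorted({r[key] for r in records})
    random.Random(seed).shuffle(entities)
    n_train = int(frac[0] * len(entities))
    n_val = int(frac[1] * len(entities))
    train_ids = set(entities[:n_train])
    val_ids = set(entities[n_train:n_train + n_val])
    train = [r for r in records if r[key] in train_ids]
    val = [r for r in records if r[key] in val_ids]
    test = [r for r in records
            if r[key] not in train_ids and r[key] not in val_ids]
    return train, val, test


# Toy data: 10 hypothetical drugs x 2 hypothetical targets, Kd in nM.
records = [
    {"drug": f"drug_{i}", "target": f"target_{j}", "kd_nm": 10.0 ** (i % 5)}
    for i in range(10)
    for j in range(2)
]
for r in records:
    r["pkd"] = nm_to_p(r["kd_nm"])

train, val, test = cold_split(records, key="drug")
print(len(train), len(val), len(test))
```

Passing `key="target"` instead gives the cold target setting. Splitting on entities rather than on individual pairs is what makes the evaluation stricter than a random split, since the model cannot rely on having seen the same drug (or protein) during training.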

NOTE: We are actively looking for constructive advice, user feedback, and experiences with using DeepPurpose! Please open an issue or contact us.

Cite Us

If you find this package useful, please cite our paper:

@article{huang2020deeppurpose,
  title={DeepPurpose: A Deep Learning Library for Drug-Target Interaction Prediction},
  author={Huang, Kexin and Fu, Tianfan and Glass, Lucas M and Zitnik, Marinka and Xiao, Cao and Sun, Jimeng},
  journal={Bioinformatics},
  year={2020}
}

Installation

Try it on Binder! Binder is a cloud Jupyter Notebook interface that will install our environment dependencies for you.

Video tutorial to install Binder.

We recommend installing DeepPurpose locally, since Binder needs to be refreshed on every launch. To install locally, we recommend installing from pip:

`pip`

conda create -n DeepPurpose python=3.6
conda activate DeepPurpose
conda install -c conda-forge notebook
pip install git+https://github.com/bp-kelley/descriptastorus 
pip install DeepPurpose
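
After installing, a quick way to confirm that the environment is usable is to check that the packages can be found by Python. The snippet below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the import names `DeepPurpose` and `descriptastorus` are assumed to match the packages installed above.

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Run this inside the activated conda environment.
missing = missing_packages(["DeepPurpose", "descriptastorus"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All packages found.")
```

This only checks that the packages are discoverable; it does not import them or verify versions.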

Build from Source

First time:

git clone https://github.com/kexinhuang12345/DeepPurpose.git ## Download code repository
cd DeepPurpose ## Change directory to DeepPurpose
conda env create -f environment.yml  ## Build virtual environment with all packages installed using conda
conda activate DeepPurpose ## Activate conda environment (use "source activate DeepPurpose" for anaconda 4.4 or earlier) 
jupyter notebook ## open the jupyter notebook with the conda env

## run our code, e.g. click a file in the DEMO folder
... ...

conda deactivate ## when done, exit conda environment

In the future:

cd DeepPurpose ## Change directory to DeepPurpose
conda activate DeepPurpose ## Activate conda environment
jupyter notebook ## open the jupyter notebook with the conda env

## run our code, e.g. click a file in the DEMO folder
... ...

conda deactivate ## when done, exit conda environment

Video tutorial to install locally from source.

Example

Case Study 1(a): A Framework for Drug Target Interaction Prediction, in fewer than 10 lines of code.

In addition to DTI prediction, we also provide repurposing and virtual screening functions to rapidly generate predictions.