CardioTox net: A robust predictor for hERG channel blockade via deep learning meta ensembling approaches

Abdul Karim, Matthew Lee, Thomas Balle, and Abdul Sattar

This repository contains the companion code for running the models described in the paper submitted to BMC Cheminformatics on 4 January 2021.

Installation

Tested on Ubuntu 20.04 with Python 3.7.7

  1. Install the conda dependency manager: https://docs.conda.io/en/latest/
  2. Create the environment from environment.yml:
conda env create -f environment.yml
  3. Activate the environment:
conda activate cardiotox
  4. Install PyBioMed:
cd PyBioMed
python setup.py install
cd ..
  5. Test the model:
python test.py

This will test the model on two external data sets mentioned in the paper.

Usage

Run Ensemble

Single SMILES String

import cardiotox

smile = "CC(=O)SC1CC2=CC(=O)CCC2(C)C2CCC3C(CCC34CCC(=O)O4)C12"

model = cardiotox.load_ensemble()

model.predict(smile)

Multiple SMILES Strings

import cardiotox

smiles = [
    "CC(=O)SC1CC2=CC(=O)CCC2(C)C2CCC3C(CCC34CCC(=O)O4)C12",
    "CCCCCCCCCC[N+](CC)(CC)CC"
]

model = cardiotox.load_ensemble()

model.predict(smiles)
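When the SMILES strings come from a text file rather than a hard-coded list, they first need to be collected into the list that `predict` accepts. The following is a minimal sketch, assuming one SMILES string per line; the helper name `clean_smiles_lines` and the file name `molecules.smi` are hypothetical and not part of the cardiotox package.

```python
from typing import Iterable, List


def clean_smiles_lines(lines: Iterable[str]) -> List[str]:
    """Strip whitespace from each line and drop blank lines.

    Hypothetical helper: assumes every non-blank line holds exactly one SMILES string.
    """
    return [line.strip() for line in lines if line.strip()]


# Possible usage inside the cardiotox environment (not executed here):
# import cardiotox
# with open("molecules.smi") as handle:   # hypothetical file name
#     smiles = clean_smiles_lines(handle)
# model = cardiotox.load_ensemble()
# model.predict(smiles)
```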

Run Individual Models

Import the model you want

from cardiotox import DescModel, SVModel, FVModel, FingerprintModel

Run the model the same way as the ensemble

from cardiotox import SVModel

smile = "CCCCCCCCCC[N+](CC)(CC)CC"

model = SVModel()

model.predict(smile)

Run Preprocessing

Each model performs its own preprocessing. When 'predict' is called, the preprocessing runs before the model. The preprocessing step can also be run on its own by calling the 'preprocess_smile' function, and its output can then be passed to 'predict_preprocessed'.

from cardiotox import SVModel

smile = "CCCCCCCCCC[N+](CC)(CC)CC"

model = SVModel()

preprocessed_smile = model.preprocess_smile([smile]) # Expects a list of SMILES strings

model.predict_preprocessed(preprocessed_smile)

Pairwise Tanimoto similarity

We made sure that none of the molecules in the two test sets (test set-I, test set-II) are similar to molecules in the training set or to each other.

Pairwise Tanimoto similarity bins
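For reference, the Tanimoto coefficient between two binary fingerprints is the number of shared on-bits divided by the number of on-bits present in either fingerprint. The sketch below illustrates that formula on plain Python sets of on-bit indices. The fingerprint type and settings used for the comparison in the paper are not stated here, so the example fingerprints are made up for illustration.

```python
from typing import Iterable


def tanimoto(bits_a: Iterable[int], bits_b: Iterable[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as collections of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen in this sketch for two empty fingerprints
    return len(a & b) / union


# Made-up fingerprints: 2 shared on-bits out of 5 distinct on-bits.
fp_train = {1, 5, 9, 42}
fp_test = {5, 9, 77}
similarity = tanimoto(fp_train, fp_test)
```

A value of 1.0 means the two fingerprints are identical, and 0.0 means they share no on-bits.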

Results

We compared our method with other state-of-the-art methods on test set-I and test set-II, as follows.

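<table>
<tr><th>Test set-I </th><th>Test set-II </th></tr>
<tr><td>

| Methods | MCC | NPV | ACC | PPV | SPE | SEN |
|---|---|---|---|---|---|---|
| CardioTox | 0.599 | 0.688 | 0.810 | 0.893 | 0.786 | 0.833 |
| DeepHIT | 0.476 | 0.643 | 0.773 | 0.833 | 0.643 | 0.833 |
| CardPred | 0.193 | 0.643 | 0.614 | 0.760 | 0.571 | 0.633 |
| OCHEM Predictor-I | 0.149 | 0.333 | 0.364 | 1.000 | 1.000 | 0.067 |
| OCHEM Predictor-II | 0.164 | 0.351 | 0.432 | 0.857 | 0.929 | 0.200 |
| Pred-hERG 4.2 | 0.306 | 0.538 | 0.705 | 0.774 | 0.500 | 0.800 |

</td><td>

| Methods | MCC | NPV | ACC | PPV | SPE | SEN |
|---|---|---|---|---|---|---|
| CardioTox | 0.469 | 0.947 | 0.758 | 0.478 | 0.600 | 0.917 |
| DeepHIT | 0.398 | 0.941 | 0.721 | 0.417 | 0.533 | 0.909 |
| CardPred | 0.049 | 0.750 | 0.527 | 0.294 | 0.600 | 0.454 |
| OCHEM Predictor-I | 0.372 | 0.800 | 0.648 | 0.666 | 0.933 | 0.364 |
| OCHEM Predictor-II | 0.310 | 0.794 | 0.632 | 0.571 | 0.900 | 0.364 |
| Pred-hERG 4.2 | 0.146 | 0.813 | 0.580 | 0.320 | 0.433 | 0.727 |

</td></tr>
</table>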
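The column abbreviations are the usual binary-classification metrics: Matthews correlation coefficient (MCC), negative predictive value (NPV), accuracy (ACC), positive predictive value (PPV), specificity (SPE), and sensitivity (SEN). The sketch below computes them from confusion-matrix counts using the standard textbook definitions. The exact formulas used in the paper are not stated here, and the counts in the example are hypothetical, not taken from either test set.

```python
import math
from typing import Dict


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero (convention of this sketch)."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "MCC": _ratio(tp * tn - fp * fn, mcc_denominator),
        "NPV": _ratio(tn, tn + fn),
        "ACC": _ratio(tp + tn, tp + fp + tn + fn),
        "PPV": _ratio(tp, tp + fp),
        "SPE": _ratio(tn, tn + fp),
        "SEN": _ratio(tp, tp + fn),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=20, tn=30, fn=10)
```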

Note: The models are only suitable for SMILES whose Morgan fingerprint contains at most 93 bits set to 1.
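A molecule can be screened against this limit before prediction by counting the on-bits of its fingerprint. The following is a minimal sketch that assumes the Morgan fingerprint has already been computed as a sequence of 0/1 values with the same settings the models use. Those settings are not given here, and the helper name `within_on_bit_limit` is hypothetical.

```python
from typing import Sequence

MAX_ON_BITS = 93  # limit stated in the note above


def within_on_bit_limit(fingerprint: Sequence[int], limit: int = MAX_ON_BITS) -> bool:
    """Return True when the fingerprint has at most `limit` bits set to 1."""
    on_bits = sum(1 for bit in fingerprint if bit)
    return on_bits <= limit


# Made-up bit vectors (the length 2048 is an assumption, not the models' setting).
small_fp = [1] * 93 + [0] * 1955   # exactly 93 on-bits: accepted
large_fp = [1] * 94 + [0] * 1954   # 94 on-bits: rejected
small_ok = within_on_bit_limit(small_fp)
large_ok = within_on_bit_limit(large_fp)
```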