Medchem design moves for generating new compounds

This repository contains a Python library for exploring medicinal chemistry design moves, as reported in the publication "The playbook of Medicinal Chemistry Design Moves, xxx-xxx-xxx". The library is adapted from the original mmpdb code.

Dependency

Dataset

How to run the program

python mmpdb mmpCompoundGenerator  --tsmiles "O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N" --transformation_db "chemblDB3.sqlitdb" --replaceGroup "*S(=O)(=O)(N)" --tradius 3 --toutput output.txt --tmin-pairs 100
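
When the generator is called from a script rather than typed by hand, passing the arguments as a list avoids shell quoting problems with the parentheses and asterisks in SMILES strings. The following is a minimal sketch, not part of the library: the helper name `build_command` is hypothetical, and only the flags shown in the command above are used.

```python
import shlex

def build_command(smiles, db, replace_group, radius, output, min_pairs):
    """Assemble the argument list for the command shown above (hypothetical helper)."""
    return [
        "python", "mmpdb", "mmpCompoundGenerator",
        "--tsmiles", smiles,
        "--transformation_db", db,
        "--replaceGroup", replace_group,
        "--tradius", str(radius),
        "--toutput", output,
        "--tmin-pairs", str(min_pairs),
    ]

argv = build_command(
    smiles="O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N",
    db="chemblDB3.sqlitdb",
    replace_group="*S(=O)(=O)(N)",
    radius=3,
    output="output.txt",
    min_pairs=100,
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(argv))

# To actually run it (requires the repository and the database file):
# import subprocess
# subprocess.run(argv, check=True)
```

Because the arguments are passed as a list, no shell is involved and the SMILES strings reach the program unmodified.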

Output

| original_smi | transformed_smi | original_frag | new_frag | rule_freq | ex_lhs_cpd_id | ex_rhs_cpd_id |
|---|---|---|---|---|---|---|
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccccc2)cc1` | `[*]S(N)(=O)=O` | `[*:1][H]` | 1103 | CHEMBL51385 | CHEMBL1201104 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `COc1ccc(-n2nc(C(F)(F)F)cc2-c2ccc(C)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]OC` | 607 | CHEMBL51385 | CHEMBL8441 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(Cl)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]Cl` | 523 | CHEMBL51385 | CHEMBL462 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(S(C)(=O)=O)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]S(C)(=O)=O` | 497 | CHEMBL468367 | CHEMBL507789 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(F)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]F` | 491 | CHEMBL3426428 | CHEMBL3426432 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(C)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]C` | 408 | CHEMBL51385 | CHEMBL274877 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(C(=O)O)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]C(=O)O` | 275 | CHEMBL1966874 | CHEMBL2094690 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc([N+](=O)[O-])cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1][N+](=O)[O-]` | 265 | CHEMBL51385 | CHEMBL8682 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(O)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]O` | 255 | CHEMBL3632832 | CHEMBL1341020 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(C(N)=O)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]C(N)=O` | 215 | CHEMBL3901141 | CHEMBL3973258 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(Br)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]Br` | 205 | CHEMBL51385 | CHEMBL450762 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(C#N)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]C#N` | 193 | CHEMBL3426428 | CHEMBL3426430 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `CC(=O)Nc1ccc(-n2nc(C(F)(F)F)cc2-c2ccc(C)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]NC(C)=O` | 170 | CHEMBL1490019 | CHEMBL1716793 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `CC(=O)c1ccc(-n2nc(C(F)(F)F)cc2-c2ccc(C)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]C(C)=O` | 152 | CHEMBL2163818 | CHEMBL1206418 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `CCOc1ccc(-n2nc(C(F)(F)F)cc2-c2ccc(C)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]OCC` | 141 | CHEMBL2163818 | CHEMBL1206420 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(C(F)(F)F)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]C(F)(F)F` | 137 | CHEMBL51385 | CHEMBL501107 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `COC(=O)c1ccc(-n2nc(C(F)(F)F)cc2-c2ccc(C)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]C(=O)OC` | 136 | CHEMBL3695775 | CHEMBL3695774 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `CCOC(=O)c1ccc(-n2nc(C(F)(F)F)cc2-c2ccc(C)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]C(=O)OCC` | 132 | CHEMBL285831 | CHEMBL241971 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(N)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]N` | 126 | CHEMBL51385 | CHEMBL463 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(N3CCOCC3)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]N1CCOCC1` | 114 | CHEMBL3679534 | CHEMBL3679542 |
| `O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N` | `Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(N(C)C)cc2)cc1` | `[*]S(N)(=O)=O` | `[*:1]N(C)C` | 108 | CHEMBL3695775 | CHEMBL3695770 |
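
The output can be post-processed with a few lines of Python. The sketch below assumes that `output.txt` is tab-separated with a header row naming the columns; the source does not state the delimiter, so adjust it if your file differs. The function name `read_design_moves` is hypothetical, and the two sample rows are copied from the output above.

```python
import csv
import io

# Two rows copied from the example output; tab-separated layout is an assumption.
SAMPLE = (
    "original_smi\ttransformed_smi\toriginal_frag\tnew_frag\trule_freq\tex_lhs_cpd_id\tex_rhs_cpd_id\n"
    "O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N\tCc1ccc(-c2cc(C(F)(F)F)nn2-c2ccccc2)cc1\t"
    "[*]S(N)(=O)=O\t[*:1][H]\t1103\tCHEMBL51385\tCHEMBL1201104\n"
    "O=S(=O)(c3ccc(n1nc(cc1c2ccc(cc2)C)C(F)(F)F)cc3)N\tCOc1ccc(-n2nc(C(F)(F)F)cc2-c2ccc(C)cc2)cc1\t"
    "[*]S(N)(=O)=O\t[*:1]OC\t607\tCHEMBL51385\tCHEMBL8441\n"
)

def read_design_moves(handle):
    """Read rows into dicts keyed by column name and convert rule_freq to int."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["rule_freq"] = int(row["rule_freq"])
        rows.append(row)
    return rows

moves = read_design_moves(io.StringIO(SAMPLE))

# Pick the most frequently observed design move.
top = max(moves, key=lambda r: r["rule_freq"])
print(top["new_frag"], top["rule_freq"])
```

For a real run, replace `io.StringIO(SAMPLE)` with `open("output.txt", newline="")`.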

Input Options

| parameter | meaning |
|---|---|
| `--transformation_db` | Transformation database file. |
| `--tsmiles` | Input query molecule (in SMILES format). |
| `--replaceGroup` | Fragment to be replaced in the query molecule (in SMILES format). It must contain `*` or `[*]` to mark each attachment point. Examples: 1) single cut: `*N1CCOCC1`; 2) double cut: `*N1CCOC(*)C1`. |
| `--tradius` | Chemical environment radius for the design move. It must equal the radius of the transformation database, which is 3 in this example. |
| `--toutput` | Name of the output text file. |
| `--tmin-pairs` | Consider only design moves supported by at least this many examples, i.e. the minimum frequency of a design move. For instance, `--tmin-pairs 5` keeps only design moves derived from at least five matched molecular pairs (MMPs). |
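
Since `--replaceGroup` must mark its attachment points with `*`, a quick pre-flight check can catch a malformed fragment before a long run. This is a minimal sketch, not part of the library: the function name `count_attachment_points` is hypothetical, it only counts `*` characters and does not validate the SMILES itself, and the double-cut fragment in the example is made up for illustration.

```python
def count_attachment_points(fragment: str) -> int:
    """Return the number of '*' attachment points in a fragment SMILES.

    Raises ValueError when the fragment has none, because the option
    requires at least one attachment point.
    """
    n = fragment.count("*")
    if n == 0:
        raise ValueError(f"no attachment point ('*' or '[*]') in {fragment!r}")
    return n

# One '*' means a single cut, two mean a double cut.
for frag in ("*S(=O)(=O)(N)", "[*]S(N)(=O)=O", "*c1ccc(*)cc1"):
    print(frag, count_attachment_points(frag))
```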

Output Columns Explanation

| column | meaning |
|---|---|
| original_smi | SMILES of the query molecule |
| transformed_smi | SMILES of the newly designed molecule |
| constant_smi | SMILES of the constant part, i.e. the part that did not change |
| original_frag | fragment in the query molecule to be replaced |
| new_frag | newly suggested fragment |
| envsmi | chemical environment of the replacement (design move) |
| rule_freq | frequency of the design move (transformation) |
| ex_lhs_cpd_id | an example MMP from ChEMBL (left-hand compound ID) |
| ex_rhs_cpd_id | an example MMP from ChEMBL (right-hand compound ID) |