Please note: this repository is no longer being maintained.

GraphINVENT

Description

GraphINVENT is a platform for graph-based molecular generation using graph neural networks. It uses a tiered deep neural network architecture to probabilistically generate new molecules a single bond at a time. All models implemented in GraphINVENT can quickly learn to build molecules resembling the training set molecules without any explicit programming of chemical rules. The models have been benchmarked using the MOSES distribution-based metrics, and the best GraphINVENT model compares well with state-of-the-art generative models.

Updates

The following versions of GraphINVENT exist in this repository:

20-08-2021:

Large update:

26-03-2021:

Small update:

10-03-2021:

The biggest changes in v2.0 from v1.0 are summarized below:

Additionally, minor typos and bugs were corrected, and the docstrings and error messages updated. Examples of minor bugs/changes:

If you spot any issues (big or small) since the update, please create an issue or a pull request (if you are able to fix it), and we will be happy to make changes.

Prerequisites

Instructions and tutorials

For detailed guides on how to use GraphINVENT, see the tutorials.

Examples

An example training set is available in ./data/gdb13_1K/. It is a small (1K) subset of GDB-13 and is already preprocessed.

Contributors

@rociomer

@rastemo

@edvardlindelof

@sararromeo

@JuanViguera

@psolsson

Contributions

Contributions are welcome in the form of issues or pull requests. To report a bug, please submit an issue. Thank you to everyone who has used the code and provided feedback thus far.

References

Relevant publications

If you use GraphINVENT in your research, please reference our publication.

Additional details related to the development of GraphINVENT are available in our technical note. You might find this note useful if you're interested in either exploring different hyperparameters or developing your own generative models.

The references in BibTeX format are available below:

@article{mercado2020graph,
  author = {Rocío Mercado and Tobias Rastemo and Edvard Lindelöf and Günter Klambauer and Ola Engkvist and Hongming Chen and Esben Jannik Bjerrum},
  title = {{Graph Networks for Molecular Design}},
  journal = {Machine Learning: Science and Technology},
  year = {2020},
  publisher = {IOP Publishing},
  doi = {10.1088/2632-2153/abcf91}
}

@article{mercado2020practical,
  author = {Rocío Mercado and Tobias Rastemo and Edvard Lindelöf and Günter Klambauer and Ola Engkvist and Hongming Chen and Esben Jannik Bjerrum},
  title = {{Practical Notes on Building Molecular Graph Generative Models}},
  journal = {Applied AI Letters},
  year = {2020},
  publisher = {Wiley Online Library},
  doi = {10.1002/ail2.18}
}

Related work

MPNNs

The MPNN implementations used in this work were pulled from Edvard Lindelöf's repo in October 2018, while he was a master's student in the MAI group. This work is available at https://github.com/edvardlindelof/graph-neural-networks-for-drug-discovery.

His master's thesis, describing the EMN implementation, can be found at https://odr.chalmers.se/handle/20.500.12380/256629.

MOSES

The MOSES repo is available at https://github.com/molecularsets/moses.

GDB-13

The example dataset provided is a subset of GDB-13. This was obtained by randomly sampling 1000 structures from the entire GDB-13 dataset. The full dataset is available for download at http://gdb.unibe.ch/downloads/.
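For anyone who wants to build a similar subset, the sketch below shows one way to draw a uniform random sample of structures from a large SMILES collection without loading it all into memory. It is only an illustration: the function name `sample_structures`, the seed, and the toy input are assumptions, and this is not necessarily how the provided subset was produced.

```python
import random
from typing import Iterable, List


def sample_structures(smiles: Iterable[str], k: int, seed: int = 0) -> List[str]:
    """Uniformly sample up to k items without replacement (reservoir sampling).

    Hypothetical helper for illustration; not part of GraphINVENT.
    """
    rng = random.Random(seed)
    reservoir: List[str] = []
    for i, structure in enumerate(smiles):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(structure)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = structure
    return reservoir


# Toy stand-in for a large SMILES file: 50 distinct linear alkanes.
toy = ["C" + "C" * n for n in range(50)]
subset = sample_structures(toy, k=10, seed=42)
print(len(subset))
```

With a real file, the same function could be fed a generator over the file's lines (for example `(line.strip() for line in f)`) with `k=1000`.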

RL-GraphINVENT

Version 3.0 incorporates Sara's work into the latest GraphINVENT framework: repo and paper. Her work was presented at the RL4RealLife workshop at ICML 2021.

Exploring graph traversal algorithms in GraphINVENT

In this pre-print, we look into the effect of different graph traversal algorithms on the types of structures that are generated by GraphINVENT. We find that a breadth-first search (BFS) generally leads to better molecules than a depth-first search (DFS), unless the model is overtrained, at which point both graph traversal algorithms lead to indistinguishable sets of structures.
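To make the difference between the two traversals concrete, here is a minimal sketch of how BFS and DFS order the atoms of a small branched molecular graph. The adjacency dictionary, the function names, and the choice of start node are assumptions made for illustration; this is not the GraphINVENT implementation.

```python
from collections import deque
from typing import Dict, List

# Heavy-atom graph of 2-methylbutane: chain 0-1-2-3 with a branch 1-4.
adjacency: Dict[int, List[int]] = {
    0: [1],
    1: [0, 2, 4],
    2: [1, 3],
    3: [2],
    4: [1],
}


def bfs_order(adj: Dict[int, List[int]], start: int = 0) -> List[int]:
    """Visit nodes level by level, using a FIFO queue."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


def dfs_order(adj: Dict[int, List[int]], start: int = 0) -> List[int]:
    """Follow one branch to its end before backtracking, using a LIFO stack."""
    order, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push neighbors in reverse so the first-listed neighbor is visited first.
        for neighbor in reversed(adj[node]):
            if neighbor not in seen:
                stack.append(neighbor)
    return order


print(bfs_order(adjacency))  # [0, 1, 2, 4, 3]
print(dfs_order(adjacency))  # [0, 1, 2, 3, 4]
```

In this toy graph, BFS reaches the branch atom (4) before the end of the chain (3), while DFS finishes the chain first, so the two traversals give different atom orderings for the same molecule.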

License

GraphINVENT is licensed under the MIT license and is free and provided as-is.

Link

https://github.com/MolecularAI/GraphINVENT/