Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation

Implementation for our paper, submitted to NeurIPS 2021 (see also this high-level blog post).

This is a minimal working version of the code used for the paper, extracted from the internal repository of the Mila Molecule Discovery project. The original commit history is not preserved here, but credit for this code goes to @bengioe, @MJ10 and @MKorablyov (see paper).

Note: for more modern implementations of GFlowNet, check out recursionpharma/gflownet, saleml/gfn, and alexhernandezgarcia/gflownet.

Grid experiments

Requirements for base experiments:

Additional requirements for active learning experiments:

Molecule experiments

Additional requirements:

We found rdkit easier to install through (mini)conda, but rdkit-pypi can also be installed with pip in a vanilla Python virtual environment. torch_geometric has non-trivial installation instructions.

If you have CUDA 10.1 configured, you can run pip install -r requirements.txt. Otherwise, edit requirements.txt to match your CUDA version (replace cu101 with cuXXX, where XXX is your CUDA version).
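The cu101 substitution can also be scripted rather than done by hand. The following is a minimal sketch, not part of the repository: the helper name retarget_cuda_tag is hypothetical, and it assumes the CUDA tag appears in requirements.txt as the literal token cu101, as the instruction above implies.

```python
import re


def retarget_cuda_tag(requirements_text: str, cuda_tag: str) -> str:
    """Return the requirements text with every 'cu101' token replaced.

    Hypothetical helper: `cuda_tag` is the tag for your CUDA version,
    written as 'cu' followed by digits (for example 'cu111').
    """
    if not re.fullmatch(r"cu\d+", cuda_tag):
        raise ValueError(f"expected a tag like 'cu111', got {cuda_tag!r}")
    # \b keeps the match to the whole token, so 'cu1011' is left untouched.
    return re.sub(r"\bcu101\b", cuda_tag, requirements_text)


# Example usage (assumption: run from the directory holding requirements.txt):
#
#   from pathlib import Path
#   path = Path("requirements.txt")
#   path.write_text(retarget_cuda_tag(path.read_text(), "cu111"))
```

The function only transforms text, so you can inspect the result before overwriting the file.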

The 300k molecule dataset is compressed to save space. To decompress it, run cd mols/data/; gunzip docked_mols.h5.gz.
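If gunzip is not available (for example on Windows), the same step can be done with the Python standard library. This is a minimal sketch, not part of the repository: the helper name gunzip_file is hypothetical, and unlike gunzip it keeps the original .gz file in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dst=None):
    """Decompress a .gz file and return the path of the output file.

    Hypothetical helper: if `dst` is omitted, the output path is `src`
    with its trailing '.gz' removed. The compressed file is not deleted.
    """
    src = Path(src)
    if dst is None:
        if src.suffix != ".gz":
            raise ValueError(f"{src} does not end in .gz; pass dst explicitly")
        dst = src.with_suffix("")  # docked_mols.h5.gz -> docked_mols.h5
    dst = Path(dst)
    # Stream in chunks so the whole dataset never has to fit in memory.
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


# Example usage, run from the repository root:
#
#   gunzip_file("mols/data/docked_mols.h5.gz")
```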