Awesome

This repository implements Graph Diffusion via the System of SDEs (GDSS) with a Graph Transformer as the score network.

Paper: Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations (ICML 2022).

Original Code Repository: https://github.com/harryjo97/GDSS

Dependencies

Create an environment with Python 3.9.15 and PyTorch 1.12.1, then run the following commands to install the requirements:

pip install -r requirements.txt
conda install pyg -c pyg
conda install -c conda-forge graph-tool=2.45
conda install -c conda-forge rdkit=2022.03.2
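
After installation, it can help to confirm that the interpreter and PyTorch match the versions named above. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_environment` are hypothetical, and it assumes PyTorch is registered under the distribution name `torch`.

```python
import sys
from importlib import metadata

# Versions taken from this README; the distribution name "torch" is an assumption.
EXPECTED = {"torch": "1.12.1"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED, python_version=(3, 9)):
    """Return a list of human-readable mismatches (empty if everything matches)."""
    problems = []
    current = tuple(sys.version_info[:2])
    if current != tuple(python_version):
        problems.append(
            f"Python {current[0]}.{current[1]} found, "
            f"README asks for {python_version[0]}.{python_version[1]}"
        )
    for name, wanted in expected.items():
        found = installed_version(name)
        if found is None:
            problems.append(f"{name} is not installed (README asks for {wanted})")
        elif not found.startswith(wanted):
            # startswith tolerates local build suffixes such as "1.12.1+cu113"
            problems.append(f"{name} {found} found, README asks for {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the README"]:
        print(line)
```

Other pinned packages (for example graph-tool or rdkit) could be added to `EXPECTED`, provided they expose version metadata under a known distribution name in your environment.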

Running Experiments

1. Preparations

We provide two general graph datasets (Planar and SBM) and two molecular graph datasets (QM9 and ZINC250k).

Download the datasets from the following links and move them to the data directory:

We provide the commands for generating general graph datasets as follows:

python data/data_generators.py --dataset <dataset> --mmd

where <dataset> is one of the general graph datasets: planar and sbm. This will create the <dataset>.pkl file in the data directory.
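
To sanity-check a generated file, it can be loaded back and summarized. This is a minimal sketch under the assumption that `<dataset>.pkl` is an ordinary pickle; the structure of its contents is not documented here, so the helper only reports the top-level type and length. The function names `load_dataset` and `describe` are hypothetical.

```python
import pickle
from pathlib import Path


def load_dataset(name, data_dir="data"):
    """Load data/<name>.pkl; assumes the file was written with the pickle module."""
    path = Path(data_dir) / f"{name}.pkl"
    with path.open("rb") as f:
        return pickle.load(f)


def describe(obj):
    """Summarize the top-level object without assuming its inner structure."""
    size = len(obj) if hasattr(obj, "__len__") else "unknown"
    return f"{type(obj).__name__} with {size} item(s)"


# Example usage after running the generator for the planar dataset:
#   print(describe(load_dataset("planar")))
```

Only unpickle files you generated yourself or obtained from a trusted source, since loading a pickle can execute arbitrary code.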

To preprocess the molecular graph datasets for training models, run the following commands:

python data/preprocess.py --dataset ${dataset_name}
python data/preprocess_for_nspdk.py --dataset ${dataset_name}

To evaluate generic graph generation tasks, compile the ORCA program (see http://www.biolab.si/supp/orca/orca.html) with the following commands:

cd evaluation/orca 
g++ -O2 -std=c++11 -o orca orca.cpp

2. Training

We provide the commands for the following tasks: Generic Graph Generation and Molecule Generation.

To train the score models, first modify config/${dataset}.yaml as needed, then run the following command:

CUDA_VISIBLE_DEVICES=${gpu_ids} python main.py --type train --config ${train_config} --seed ${seed}

For example:

CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py --type train --config planar --seed 42

and

CUDA_VISIBLE_DEVICES=0 python main.py --type train --config qm9 --seed 42
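
When training the same configuration with several seeds, the command above can be assembled programmatically. The sketch below is an illustration, not part of the repository: `build_command` and `run` are hypothetical helpers that use only the flags shown in this README (`--type`, `--config`, `--seed`) and set `CUDA_VISIBLE_DEVICES` through the environment. It defaults to a dry run that prints each command instead of executing it.

```python
import os
import subprocess
import sys


def build_command(run_type, config, seed=None):
    """Build the main.py invocation as an argument list."""
    cmd = [sys.executable, "main.py", "--type", run_type, "--config", config]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    return cmd


def run(cmd, gpu_ids, dry_run=True):
    """Print the command (dry run) or execute it with CUDA_VISIBLE_DEVICES set."""
    if dry_run:
        print(f"CUDA_VISIBLE_DEVICES={gpu_ids} " + " ".join(cmd))
        return 0
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu_ids)
    # check=True raises CalledProcessError if training exits with an error
    return subprocess.run(cmd, env=env, check=True).returncode


if __name__ == "__main__":
    # The seed values here are arbitrary examples.
    for seed in (0, 1, 42):
        run(build_command("train", "planar", seed), gpu_ids="0")
```

Pass `dry_run=False` from the repository root to launch the runs one after another.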

3. Generation and Evaluation

To generate graphs with the trained score models and evaluate them, run the following command:

CUDA_VISIBLE_DEVICES=${gpu_ids} python main.py --type sample --config planar

or

CUDA_VISIBLE_DEVICES=${gpu_ids} python main.py --type sample --config sample_qm9

Pretrained checkpoints

We provide checkpoints of the pretrained models at the following links:

Citation

@article{jo2022GDSS,
  author    = {Jaehyeong Jo and
               Seul Lee and
               Sung Ju Hwang},
  title     = {Score-based Generative Modeling of Graphs via the System of Stochastic
               Differential Equations},
  journal   = {arXiv:2202.02514},
  year      = {2022},
  url       = {https://arxiv.org/abs/2202.02514}
}