A Group Symmetric Stochastic Differential Equation Model for Molecule Multi-modal Pretraining

ICML 2023

Shengchao Liu<sup>+</sup>, Weitao Du<sup>+</sup>, Zhiming Ma, Hongyu Guo, Jian Tang

<sup>+</sup> Equal contribution

[Project Page] [Paper] [ArXiv] [Checkpoints on HuggingFace]

<p align="center"> <img src="figure/pipeline.png" /> </p>

All pretrained checkpoints are available on HuggingFace (see the checkpoint link above). The file README_checkpoints.md details which checkpoint corresponds to which table.

<p align="left"> <img src="figure/demo.gif" width="100%" /> </p>

Environments

conda create -n Geom3D python=3.7
conda activate Geom3D
conda install -y -c rdkit rdkit
conda install -y numpy networkx scikit-learn
conda install -y -c conda-forge -c pytorch pytorch=1.9.1
conda install -y -c pyg -c conda-forge pyg=2.0.2
pip install ogb==1.2.1

pip install sympy

pip install ase  # for SchNet

pip install -e .
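After installation, a quick sanity check can confirm that the pinned packages resolved to the expected versions. The script below is a minimal sketch and not part of the repository: the `PINS` table and the `check_environment` helper are hypothetical names introduced here, and the import names (`torch_geometric` for `pyg`, `sklearn` for `scikit-learn`) are the usual ones for those packages. It assumes each package exposes a `__version__` attribute.

```python
import importlib

# Version pins copied from the install commands above; None means "any version".
# Keys are import names, which differ from some package names
# (pyg -> torch_geometric, scikit-learn -> sklearn).
PINS = {
    "torch": "1.9.1",
    "torch_geometric": "2.0.2",
    "ogb": "1.2.1",
    "rdkit": None,
    "numpy": None,
    "networkx": None,
    "sklearn": None,
    "sympy": None,
    "ase": None,
}


def check_environment(pins):
    """Return a list of (module, found_version, status) tuples."""
    report = []
    for name, wanted in pins.items():
        try:
            module = importlib.import_module(name)
        except Exception:
            # Not installed, or installed but failing to import.
            report.append((name, None, "unavailable"))
            continue
        found = getattr(module, "__version__", None)
        if wanted is None:
            status = "ok"
        elif found is not None and str(found).split("+")[0] == wanted:
            # Dropping the "+..." part tolerates local build tags like "1.9.1+cu111".
            status = "ok"
        else:
            status = "mismatch"
        report.append((name, found, status))
    return report


if __name__ == "__main__":
    for name, found, status in check_environment(PINS):
        print(f"{name:16s} {str(found):14s} {status}")
```

Run it inside the activated `Geom3D` environment. Any line reporting `unavailable` or `mismatch` points to the install command worth re-running.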

Datasets

Pretraining

A quick demo on pretraining is:

cd examples

python pretrain_MoleculeSDE.py \
--verbose --input_data_dir=../data --dataset=PCQM4Mv2 \
--model_3d=SchNet \
--lr=1e-4 --num_workers=0 --batch_size=256 --SSL_masking_ratio=0 --gnn_3d_lr_scale=0.1 --dropout_ratio=0 --graph_pooling=mean --emb_dim=300 --epochs=1 \
--SDE_coeff_contrastive=1 --CL_similarity_metric=EBM_node_dot_prod --T=0.1 --normalize --SDE_coeff_contrastive_skip_epochs=0 \
--SDE_coeff_generative_2Dto3D=1 --SDE_2Dto3D_model=SDEModel2Dto3D_02 --SDE_type_2Dto3D=VE --use_extend_graph \
--SDE_coeff_generative_3Dto2D=1 --SDE_3Dto2D_model=SDEModel3Dto2D_node_adj_dense --SDE_type_3Dto2D=VE --noise_on_one_hot \
--output_model_dir=[MODEL_DIR]

Replace [MODEL_DIR] with the directory where the models/checkpoints should be saved.
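When the demo is launched repeatedly with different output directories, it can help to assemble the command programmatically. The following is a minimal sketch, not part of the repository: `build_pretrain_command` is a hypothetical helper, and `../output/demo` is a placeholder path. All flags are copied from the demo command above, with `--epochs` given once.

```python
import shlex


def build_pretrain_command(model_dir, epochs=1):
    """Assemble the demo command as an argument list (usable with subprocess.run)."""
    return [
        "python", "pretrain_MoleculeSDE.py",
        "--verbose", "--input_data_dir=../data", "--dataset=PCQM4Mv2",
        "--model_3d=SchNet",
        "--lr=1e-4", f"--epochs={epochs}", "--num_workers=0", "--batch_size=256",
        "--SSL_masking_ratio=0", "--gnn_3d_lr_scale=0.1", "--dropout_ratio=0",
        "--graph_pooling=mean", "--emb_dim=300",
        "--SDE_coeff_contrastive=1", "--CL_similarity_metric=EBM_node_dot_prod",
        "--T=0.1", "--normalize", "--SDE_coeff_contrastive_skip_epochs=0",
        "--SDE_coeff_generative_2Dto3D=1", "--SDE_2Dto3D_model=SDEModel2Dto3D_02",
        "--SDE_type_2Dto3D=VE", "--use_extend_graph",
        "--SDE_coeff_generative_3Dto2D=1",
        "--SDE_3Dto2D_model=SDEModel3Dto2D_node_adj_dense",
        "--SDE_type_3Dto2D=VE", "--noise_on_one_hot",
        f"--output_model_dir={model_dir}",
    ]


if __name__ == "__main__":
    # "../output/demo" is a placeholder; substitute your own [MODEL_DIR].
    cmd = build_pretrain_command("../output/demo")
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(arg) for arg in cmd))
    # To launch it from the examples folder instead of printing:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Returning a list rather than a single string avoids shell-quoting problems if the output path contains spaces.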

Downstream

The downstream scripts can be found under the examples folder. Below we illustrate a few simple examples.

Cite Us

Feel free to cite this work if you find it useful!

@inproceedings{liu2023group,
  title={A group symmetric stochastic differential equation model for molecule multi-modal pretraining},
  author={Liu, Shengchao and Du, Weitao and Ma, Zhi-Ming and Guo, Hongyu and Tang, Jian},
  booktitle={International Conference on Machine Learning},
  pages={21497--21526},
  year={2023},
  organization={PMLR}
}