FoldMark: Protecting Protein Generative Models with Watermarking
<div align=center><img src="https://github.com/zaixizhang/FoldMark/blob/main/assets/foldmark.png" width="202"/></div>

In this GitHub repo, we apply FoldMark to FrameFlow as an example.
Installation
# Conda environment with dependencies.
conda env create -f foldmark.yml
# Activate environment
conda activate fm
# torch-scatter must be installed manually.
pip install torch-scatter -f https://data.pyg.org/whl/torch-2.0.0+cu117.html
# Install local package.
# Current directory should be FoldMark/
pip install -e .
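After the steps above, a quick import check can confirm that the environment is usable before starting a long run. The following is a minimal sketch, not part of the repository: the `missing_packages` helper and the default package list are assumptions chosen for illustration (`torch_scatter` is the import name of torch-scatter), so adjust the list to match `foldmark.yml`.

```python
import importlib.util

# Hypothetical default list of import names; adjust to match foldmark.yml.
REQUIRED = ("torch", "torch_scatter", "wandb")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks packages up on the import path and does not import them, so it runs quickly and does not touch the GPU.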
Wandb
Our training relies on logging with Wandb. Create a Wandb account and log in. Authorize Wandb here.
Data
Download the preprocessed SCOPe dataset (~280MB) hosted on Dropbox: link.
Other datasets can in principle be prepared for training with the data/process_pdb_files.py
script.
However, we do not currently support other datasets.
# Expand tar file.
tar -xvzf preprocessed_scope.tar.gz
rm preprocessed_scope.tar.gz
Your directory should now look like this:
├── analysis
├── build
├── configs
├── data
├── experiments
├── media
├── models
├── openfold
├── preprocessed
└── weights
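To verify the layout before training, a short script can compare the working directory against the tree above. This is a sketch under stated assumptions: `missing_dirs` is a hypothetical helper that is not part of the repository, and the directory names are copied directly from the listing.

```python
from pathlib import Path

# Directory names copied from the tree listed above.
EXPECTED_DIRS = (
    "analysis", "build", "configs", "data", "experiments",
    "media", "models", "openfold", "preprocessed", "weights",
)


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected directory names that are absent under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    print("Layout OK" if not missing else f"Missing directories: {missing}")
```

Run it from the FoldMark/ directory. A missing `preprocessed` entry usually means the archive was not extracted in the repository root.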
Pretrain
python -W ignore experiments/pretrain.py
Finetune
python -W ignore experiments/finetune.py
Acknowledgements
We thank the authors of the open-source code from WaDiff, AquaLoRA, and OpenFold.
Reference
@article{zhang2024foldmark,
title={FoldMark: Protecting Protein Generative Models with Watermarking},
author={Zhang, Zaixi and Jin, Ruofan and Fu, Kaidi and Cong, Le and Zitnik, Marinka and Wang, Mengdi},
journal={bioRxiv},
pages={2024--10},
year={2024},
publisher={Cold Spring Harbor Laboratory}
}