Home

Awesome

VillanDiffusion

Code Repo for the NeurIPS 2023 paper "VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models"

Environment

Usage

Install Require Packages and Prepare Essential Data

Please run

bash install.sh

Wandb Logging Support

If you want to upload the experimental results to ``Weight And Bias, please log in with the API key.

wandb login --relogin --cloud <API Key>

Prepare Dataset

Training

Evaluation

Pre-Trained Models

I've uploaded all pre-trained backdoor diffusion models for BadDiffusion and VillanDiffusion on HuggingFace. Please feel free to download backdoored diffusion models from it.

Note that for training backdoor score-based models, you need to download the pre-trained clean model from HuggingFace and put it under the working directory.

Backdoor Unconditional Diffusion Models with VillanDiffusion

Arguments

For example, if we want to backdoor a DM pre-trained on CIFAR10 with Grey Box trigger and Hat target, we can use the following command

python VillanDiffusion.py --project default --mode train+measure --dataset CIFAR10 --batch 128 --epoch 50 --poison_rate 0.1 --trigger BOX_14 --target HAT --ckpt DDPM-CIFAR10-32 --fclip o -o --gpu 0

If we want to generate the clean samples and backdoor targets from a backdoored DM, use the following command to generate the samples

python VillanDiffusion.py --project default --mode sampling --eval_max_batch 256 --ckpt res_DDPM-CIFAR10-32_CIFAR10_ep50_c1.0_p0.1_BOX_14-HAT --fclip o --gpu 0

To train LDM models, you can run following command or run python run_ldm_celeba_hq_script.py.

python VillanDiffusion.py --postfix new-set --project default --mode train --dataset CELEBA-HQ-LATENT --dataset_load_mode NONE --sde_type SDE-LDM --learning_rate 0.0002 --sched UNIPC-SCHED --infer_steps 20 --batch 16 --epoch 2000 --clean_rate 1 --poison_rate 0.9 --trigger GLASSES --target CAT --solver_type ode --psi 1 --vp_scale 1.0 --ve_scale 1.0 --ckpt LDM-CELEBA-HQ-256 --fclip o --save_image_epochs 1 --save_model_epochs 1 --result exp_GenBadDiffusion_LDM_BadDiff_ODE -o --gpu 0

To train Score-Based models, you can run following command or run python run_score-basde_model_script.py.

Note that for training backdoor score-based models, you need to download the pre-trained clean model from HuggingFace and put it under the working directory.

python VillanDiffusion.py --postfix flex_new-set --project default --mode train --learning_rate 2e-05 --dataset CIFAR10 --sde_type SDE-VE --batch 128 --epoch 30 --clean_rate 1.0 --poison_rate 0.98 --dataset_load_mode FIXED --trigger STOP_SIGN_14 --target HAT --solver_type sde --psi 0 --vp_scale 1.0 --ve_scale 1.0 --ckpt NCSN_CIFAR10_my --fclip o --save_image_epochs 5 --save_model_epochs 5 --result exp_GenBadDiffusion_NCSNPP_CIFAR10_TrojDiff_SDE_FLEX -o --R_trigger_only --gpu 0

To measure with inpainting task, you can run following instruction or run python run_measure_inpaint.py

python VillanDiffusion.py --project default --mode measure --task poisoned_denoise --sched UNIPC-SCHED --infer_steps 20 --infer_start 10 --ckpt /work/u2941379/workspace/backdoor_diffusion/res_DDPM-CIFAR10-32_CIFAR10_ep100_ode_c1.0_p0.2_SM_STOP_SIGN-BOX_psi1.0_lr0.0002_vp1.0_ve1.0_new-set-1_test --fclip o --gpu 0

Backdoor Conditional Diffusion Models with VillanDiffusion

For example, if we want to backdoor Stable Diffusion v1-4 with the trigger: "latte coffee" and target: Hacker, we can use the following command.

python viallanDiffusion_conditional.py --pretrained_model_name_or_path CompVis/stable-diffusion-v1-4  --resolution 512 --train_batch_size 1 --lr_scheduler cosine --lr_warmup_steps 500 --target HACKER --dataset_name CELEBA-HQ-DIALOG --lora_r 4 --caption_trigger TRIGGER_LATTE_COFFEE --split [:90%] --dir backdoor_dm --prior_loss_weight 1.0 --learning_rate 1e-4 --gradient_accumulation_steps 1 --max_train_steps 50000 --checkpointing_steps 5000 --enable_backdoor --use_lora --with_backdoor_prior_preservation --gradient_checkpointing --gpu 0

Generate Samples

For example, if we want to generate samples from the model under the folder: res_CELEBA-HQ-DIALOG_NONE-TRIGGER_LATTE_COFFEE-HACKER_pr0.0_ca0_caw1.0_rctp0_lr0.0001_step50000_prior1.0_lora4, we can use the following command.

python sampling.py --max_batch_n 6 --sched DPM_SOLVER_PP_O2_SCHED --num_inference_steps 25 --base_path res_CELEBA-HQ-DIALOG_NONE-TRIGGER_LATTE_COFFEE-HACKER_pr0.0_ca0_caw1.0_rctp0_lr0.0001_step50000_prior1.0_lora4 --ckpt_step -1 --gpu 0
<!-- ### Clean Loss: $$ || \epsilon_{t} + \sigma_{t} \cdot \epsilon_{\theta}(x_{0} + \sigma_t \epsilon_{t}) ||_2 $$ ### Backdoor Loss: $$ || (\epsilon_{t} + \mathcal{R}_{t} \mathbf{r}) + \sigma_{t} \cdot \epsilon_{\theta}(\mathbf{y} + \sigma_t \epsilon_{t} + \mathcal{S}_{t} \mathbf{r}) ||_2 $$ $\mathcal{R}_{t}$ is R_coef, $\mathbf{y}$ is target image , and $\mathcal{S}_{t}$ is step -->