[CVPR 2024] MACE: Mass Concept Erasure in Diffusion Models

Official implementation of MACE: Mass Concept Erasure in Diffusion Models.

MACE: Mass Concept Erasure in Diffusion Models<br>

Shilin Lu, Zilan Wang, Leyang Li, Yanzhu Liu, Adams Wai-Kin Kong <br> CVPR 2024

Abstract: <br> The rapid expansion of large-scale text-to-image diffusion models has raised growing concerns regarding their potential misuse in creating harmful or misleading content. In this paper, we introduce MACE, a finetuning framework for the task of mass concept erasure. This task aims to prevent models from generating images that embody unwanted concepts when prompted. Existing concept erasure methods are typically restricted to handling fewer than five concepts simultaneously and struggle to find a balance between erasing concept synonyms (generality) and maintaining unrelated concepts (specificity). In contrast, MACE differs by successfully scaling the erasure scope up to 100 concepts and by achieving an effective balance between generality and specificity. This is achieved by leveraging closed-form cross-attention refinement along with LoRA finetuning, collectively eliminating the information of undesirable concepts. Furthermore, MACE integrates multiple LoRAs without mutual interference. We conduct extensive evaluations of MACE against prior methods across four different tasks: object erasure, celebrity erasure, explicit content erasure, and artistic style erasure. Our results reveal that MACE surpasses prior methods in all evaluated tasks.

</div>

Framework overview. (a) Our framework focuses on tuning the prompt-related projection matrices within cross-attention (CA) blocks. (b) The pretrained U-Net's CA blocks are refined using a closed-form solution, discouraging the model from embedding the residual information of the target phrase into surrounding words. (c) For each concept targeted for removal, a distinct LoRA module is learned to eliminate its intrinsic information. (d) A closed-form solution is introduced to integrate multiple LoRA modules without interfering with one another while averting catastrophic forgetting.
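To make the closed-form steps in (b) and (d) more concrete, here is a minimal NumPy sketch of a generic ridge-style closed-form update of a projection matrix. The objective in the docstring, the function name `closed_form_refine`, and the arguments `src`, `dst`, `prior`, and `lam` are illustrative assumptions, not the repository's code; the exact objectives MACE uses are given in the paper.

```python
import numpy as np

def closed_form_refine(W, src, dst, prior, lam=0.1):
    """Return W' minimising (assumed, illustrative objective)
        sum_i ||W' src_i - W dst_i||^2 + lam * sum_j ||W' prior_j - W prior_j||^2.

    W:     (d_out, d_in) pretrained projection matrix
    src:   (n, d_in) embeddings whose outputs should change
    dst:   (n, d_in) embeddings whose outputs they should imitate
    prior: (m, d_in) embeddings whose outputs should be preserved
    """
    # Setting the gradient to zero gives W' @ rhs = lhs.
    lhs = W @ (dst.T @ src + lam * prior.T @ prior)   # (d_out, d_in)
    rhs = src.T @ src + lam * prior.T @ prior         # (d_in, d_in), symmetric
    # Solve W' @ rhs = lhs without forming an explicit inverse.
    return np.linalg.solve(rhs, lhs.T).T

# Random stand-ins for real text embeddings, for illustration only.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
prior = rng.normal(size=(16, 4))
src = rng.normal(size=(2, 4))
dst = rng.normal(size=(2, 4))

# With a small lam, the refined matrix maps src (almost) to where W maps dst.
W_new = closed_form_refine(W, src, dst, prior, lam=1e-6)
```

Because the update is a single linear solve, no gradient steps are needed, and the `prior` term is what keeps unrelated inputs mapped close to where the pretrained matrix sent them.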

</div>

Setup

Creating a Conda Environment

git clone https://github.com/Shilin-LU/MACE.git
conda create -n mace python=3.10
conda activate mace
conda install pytorch==2.0.1 torchvision==0.15.2 pytorch-cuda=11.7 -c pytorch -c nvidia

Install Grounded-SAM (Official Version) to Prepare Masks for LoRA Tuning

Note: This installation process can be complex. You may skip this section and use the HuggingFace version to prepare data instead.

export AM_I_DOCKER=False
export BUILD_WITH_CUDA=True
# export CUDA_HOME=/path/to/cuda-11.7/

cd MACE
git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git
cd Grounded-Segment-Anything

# Install Segment Anything:
python -m pip install -e segment_anything

# Install Grounding DINO:
pip install --no-build-isolation -e GroundingDINO

# Install osx:
git submodule update --init --recursive
cd grounded-sam-osx && bash install.sh

# Install RAM & Tag2Text:
git clone https://github.com/xinyu1205/recognize-anything.git
pip install -r ./recognize-anything/requirements.txt
pip install -e ./recognize-anything/

Download the pretrained weights of Grounded-SAM.

cd ..    # cd Grounded-Segment-Anything

# Download the pretrained groundingdino-swin-tiny model:
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth

# Download the pretrained SAM model:
wget https://huggingface.co/lkeab/hq-sam/resolve/main/sam_hq_vit_h.pth

Install Other Dependencies

pip install diffusers==0.22.0 transformers==4.46.2 huggingface_hub==0.25.2
pip install accelerate openai omegaconf opencv-python
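After installation, it can help to confirm that the pinned versions are the ones actually installed. The sketch below uses only the standard library; the helper name `find_mismatches` is hypothetical, and the pins are copied from the install commands in this README.

```python
from importlib import metadata

# Versions pinned in the install commands of this README.
PINS = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "diffusers": "0.22.0",
    "transformers": "4.46.2",
    "huggingface_hub": "0.25.2",
}

def find_mismatches(pins):
    """Map each package whose installed version differs from its pin
    to an (expected, installed) pair; installed is None if absent."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Ignore local build tags such as "+cu117" when comparing.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches

if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINS).items():
        print(f"{name}: expected {expected}, found {installed}")
```

Running the script inside the `mace` environment prints one line per missing or mismatched package and nothing if every pin matches.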

Data Preparation for Training MACE

To erase a concept, MACE needs 8 generated images of that concept, each with its segmentation mask. To prepare this data for your intended concept, configure your settings in configs/object/erase_ship.yaml and run one of the two data preparation scripts below.

Grounded SAM (HuggingFace Version)

To simplify environment setup, you can instead use the transformers-based Grounded SAM in data_preparation_transformers.py. It does not depend on a specific CUDA version, as long as the transformers library runs in your environment.

CUDA_VISIBLE_DEVICES=0 python data_preparation_transformers.py configs/object/erase_ship.yaml

You only need to set detector_id and segmenter_id; the defaults in the file are detector_id = "IDEA-Research/grounding-dino-base" and segmenter_id = "facebook/sam-vit-huge". You can also adjust the threshold hyperparameter to obtain more refined masks.
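As a rough illustration of what the threshold controls, the sketch below filters detections by confidence score. It assumes detections are dicts with `score`, `label`, and `box` keys, which is the output format of the transformers zero-shot object detection pipeline; whether data_preparation_transformers.py applies the threshold in exactly this way is an assumption, and `filter_detections` is a hypothetical helper.

```python
def filter_detections(detections, threshold):
    """Keep detections whose confidence score is at least `threshold`."""
    return [d for d in detections if d["score"] >= threshold]

# Made-up detections for illustration only.
detections = [
    {"score": 0.82, "label": "ship", "box": {"xmin": 40, "ymin": 60, "xmax": 300, "ymax": 220}},
    {"score": 0.35, "label": "ship", "box": {"xmin": 310, "ymin": 90, "xmax": 360, "ymax": 130}},
    {"score": 0.12, "label": "ship", "box": {"xmin": 5, "ymin": 5, "xmax": 20, "ymax": 25}},
]

# A higher threshold keeps fewer, more confident boxes.
for threshold in (0.1, 0.3, 0.5):
    kept = filter_detections(detections, threshold)
    print(threshold, [d["score"] for d in kept])
```

A higher threshold tends to give cleaner masks but may miss the object entirely in some images, so it is worth inspecting the generated masks after changing it.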

Grounded SAM (Official Version)

CUDA_VISIBLE_DEVICES=0 python data_preparation.py configs/object/erase_ship.yaml

Download Pre-cached Files

Before beginning the mass concept erasing process, ensure that you have pre-cached the prior knowledge (e.g., MSCOCO) and domain-specific knowledge (e.g., certain celebrities, artistic styles, or objects) you wish to retain.

Training MACE to Erase Concepts

After preparing the data, you can specify your training parameters in the same configuration file configs/object/erase_ship.yaml and run the following command:

CUDA_VISIBLE_DEVICES=0 python training.py configs/object/erase_ship.yaml

Sampling from the Finetuned Model

The finetuned model can be simply tested by running the following command to generate several images:

CUDA_VISIBLE_DEVICES=0 python inference.py \
          --num_images 3 \
          --prompt 'your_prompt' \
          --model_path /path/to/saved_model/LoRA_fusion_model \
          --save_path /path/to/save/folder
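If you prefer to sample from Python rather than through inference.py, something like the following sketch may work. It assumes the saved `LoRA_fusion_model` folder is a standard diffusers pipeline directory and that a CUDA GPU is available; `generate` and `output_paths` are hypothetical helpers, not functions from this repository.

```python
from pathlib import Path

def output_paths(save_path, num_images):
    """Return one PNG path per generated image inside `save_path`."""
    return [Path(save_path) / f"{i:03d}.png" for i in range(num_images)]

def generate(model_path, prompt, num_images, save_path):
    """Generate `num_images` images for `prompt` and save them as PNGs."""
    # Imported lazily so the helper above works without the GPU stack.
    import torch
    from diffusers import StableDiffusionPipeline

    # Assumption: model_path is a standard diffusers pipeline directory.
    pipe = StableDiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    images = pipe(prompt, num_images_per_prompt=num_images).images
    Path(save_path).mkdir(parents=True, exist_ok=True)
    for image, path in zip(images, output_paths(save_path, num_images)):
        image.save(path)

# Example call (requires the finetuned weights and a GPU):
# generate("/path/to/saved_model/LoRA_fusion_model", "your_prompt", 3, "/path/to/save/folder")
```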

To generate many images from a list of prompts with predetermined seeds (e.g., from the CSV file ./prompts_csv/celebrity_100_concepts.csv), run the command below. The step hyperparameter should be set to the same value as num_processes:

CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch \
          --multi_gpu --num_processes=4 --main_process_port 31372 \
          src/sample_images_from_csv.py \
          --prompts_path ./prompts_csv/celebrity_100_concepts.csv \
          --save_path /path/to/save/folder \
          --model_name /path/to/saved_model/LoRA_fusion_model \
          --step 4
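One plausible reason step must match num_processes is that each process takes every step-th row of the CSV, offset by its rank. The sketch below shows that strided split; it is an assumption about how src/sample_images_from_csv.py shards its work, and `rows_for_process` is a hypothetical helper.

```python
def rows_for_process(rows, rank, step):
    """Return every `step`-th row, starting at index `rank`."""
    return rows[rank::step]

# Ten placeholder row indices stand in for the CSV contents.
rows = list(range(10))

# Four processes with step=4: every row is handled exactly once.
matched = [rows_for_process(rows, rank, step=4) for rank in range(4)]

# Two processes with step=4: rows 2, 3, 6, and 7 are never handled.
mismatched = [rows_for_process(rows, rank, step=4) for rank in range(2)]
covered = sorted(r for shard in mismatched for r in shard)
```

Under this scheme, a step smaller than the number of processes would instead make some processes generate the same rows twice.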

MACE Finetuned Model Weights

We provide several Stable Diffusion v1.4 models finetuned with MACE.

| Concept Type to Erase | Finetuned Model |
|---|---|
| Object Erasure | OneDrive link |
| Celebrity Erasure | OneDrive link |
| Artistic Style Erasure | OneDrive link |
| Explicit Content Erasure | OneDrive link |
<!-- - sample images using models finetuned to forget specific objects: ``` CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch \ --multi_gpu --num_processes=4 --main_process_port 13379 \ src/sample_images_objects.py \ --erased_object airplane \ --save_path /path/to/save/folder \ --model_name /path/to/model \ --step 4 ``` -->

Metrics Evaluation

During our evaluation, we employ various metrics including FID, CLIP score, CLIP classification accuracy, GCD accuracy, and NudeNet detection results.

CUDA_VISIBLE_DEVICES=0 python metrics/evaluate_fid.py --dir1 'path/to/generated/image/folder' --dir2 'path/to/coco/GT/folder'
CUDA_VISIBLE_DEVICES=0 python metrics/evaluate_clip_score.py --image_dir 'path/to/generated/image/folder' --prompts_path './prompts_csv/coco_30k.csv'
conda activate GCD
CUDA_VISIBLE_DEVICES=0 python metrics/evaluate_by_GCD.py --image_folder 'path/to/generated/image/folder'
CUDA_VISIBLE_DEVICES=0 python metrics/evaluate_by_nudenet.py --folder 'path/to/generated/image/folder'
CUDA_VISIBLE_DEVICES=0 python metrics/evaluate_clip_accuracy.py --base_folder 'path/to/generated/image/folder'
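For readers unfamiliar with FID, the sketch below computes the Fréchet distance between Gaussians fitted to two sets of feature vectors, which is the quantity FID is defined on. In practice the features come from an Inception network; here they are random stand-ins, and `frechet_distance` is a hypothetical helper, so metrics/evaluate_fid.py may be implemented differently.

```python
import numpy as np

def frechet_distance(feats1, feats2):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats1, feats2: arrays of shape (n_samples, n_features).
    """
    mu1, mu2 = feats1.mean(axis=0), feats2.mean(axis=0)
    sigma1 = np.cov(feats1, rowvar=False)
    sigma2 = np.cov(feats2, rowvar=False)
    # Tr((sigma1 sigma2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2, which are real and non-negative for
    # positive semi-definite inputs (up to round-off).
    eigvals = np.linalg.eigvals(sigma1 @ sigma2).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()
    diff = mu1 - mu2
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)

# Random stand-ins for image features, for illustration only.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 5))
same = frechet_distance(feats, feats)           # identical sets: close to 0
shifted = frechet_distance(feats, feats + 2.0)  # mean shift only: close to 5 * 2.0**2
```

Lower values mean the generated images are statistically closer to the reference set, which is why FID on MSCOCO is used to check that erasure has not degraded unrelated generations.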

Acknowledgments

Our code builds on the following projects, and we thank their contributors: Diffusers, Concept-Ablation, Forget-Me-Not, UCE.

Citation

If you find this repo useful, please consider citing:

@inproceedings{lu2024mace,
  title={Mace: Mass concept erasure in diffusion models},
  author={Lu, Shilin and Wang, Zilan and Li, Leyang and Liu, Yanzhu and Kong, Adams Wai-Kin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={6430--6440},
  year={2024}
}