DiffEngine
Introduction
DiffEngine is the open-source toolbox for training state-of-the-art Diffusion Models. Packed with advanced features including diffusers and MMEngine, DiffEngine empowers both seasoned experts and newcomers in the field to efficiently create and enhance diffusion models. Stay at the forefront of innovation with our cutting-edge platform, accelerating your journey in Diffusion Models training.
- Training state-of-the-art Diffusion Models: Empower your projects with state-of-the-art Diffusion Models. Supported methods include Stable Diffusion, Stable Diffusion XL, DreamBooth, LoRA, and more.
- Unified Config System and Module Designs: Thanks to MMEngine, our platform boasts a unified configuration system and modular designs that streamline your workflow. Easily customize hyperparameters, loss functions, and other crucial settings while maintaining a structured and organized project environment.
- Inference with diffusers.pipeline: Move from training to inference using the diffusers.pipeline module. Trained Diffusion Models can be loaded directly into a diffusers pipeline to generate images.
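The idea behind overriding hyperparameters in a unified config can be illustrated with a recursive dictionary merge. This is a minimal sketch in plain Python, not MMEngine's implementation; the function name `merge_config` and every key in the example dictionaries are hypothetical and chosen only for illustration.

```python
from copy import deepcopy


def merge_config(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` replace those in `base`.

    Nested dicts are merged key by key; any other value is replaced outright.
    Neither input is modified.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical base config; these keys are illustrative only.
base_cfg = {
    "optimizer": {"type": "AdamW", "lr": 1e-4, "weight_decay": 1e-2},
    "train": {"max_iters": 1000},
}

# Override a single hyperparameter and leave everything else untouched.
custom_cfg = merge_config(base_cfg, {"optimizer": {"lr": 5e-5}})
print(custom_cfg["optimizer"])
```

Only the learning rate changes in `custom_cfg`; the other optimizer settings and the `train` section are carried over from the base, which is what makes small per-experiment config files practical.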
Installation
Before installing DiffEngine, please ensure that PyTorch >= v2.0 has been successfully installed following the official guide.
Install DiffEngine
pip install git+https://github.com/okotaku/diffengine.git
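To confirm the PyTorch requirement before installing, a small check like the following may help. It is a sketch, not part of DiffEngine: the helper names `parse_version` and `meets_minimum` are hypothetical, and the parsing assumes version strings that start with dotted integers, such as `2.1.0+cu118`.

```python
import re


def parse_version(version: str) -> tuple:
    """Extract the leading numeric components, padded to three fields.

    For example, '2.1.0+cu118' becomes (2, 1, 0) and '2.0' becomes (2, 0, 0).
    """
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    parts = [int(part) for part in match.group(1).split(".")]
    parts += [0] * (3 - len(parts))  # no-op when three or more fields exist
    return tuple(parts[:3])


def meets_minimum(version: str, minimum: str = "2.0") -> bool:
    """Return True if `version` is at least `minimum`."""
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed.")
    else:
        status = "OK" if meets_minimum(torch.__version__) else "too old"
        print(f"PyTorch {torch.__version__}: {status}")
```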
Get Started
DiffEngine makes training easy through its pre-defined configs. These configs provide a streamlined way to start your training process. Here's how you can get started using one of the pre-defined configs:
- Choose a config: You can find various pre-defined configs in the configs directory of the DiffEngine repository. For example, to train a DreamBooth model using the Stable Diffusion algorithm, you can use configs/stable_diffusion_dreambooth/stable_diffusion_v15_dreambooth_lora_dog.py.
- Start training: Open a terminal and run the following command to start training with the selected config:
diffengine train stable_diffusion_v15_dreambooth_lora_dog
- Monitor progress and get results: The training process will begin, and you can track its progress. When using the stable_diffusion_v15_dreambooth_lora_dog config, the training outputs are written to the work_dirs/stable_diffusion_v15_dreambooth_lora_dog directory.
work_dirs/stable_diffusion_v15_dreambooth_lora_dog
├── 20230802_033741
|   ├── 20230802_033741.log  # log file
|   └── vis_data
|         ├── 20230802_033741.json  # log json file
|         ├── config.py  # config file for each experiment
|         └── vis_image  # visualized image from each step
├── step999/unet
|   ├── adapter_config.json  # adapter config file
|   └── adapter_model.bin  # weight for inferencing with diffusers.pipeline
├── iter_1000.pth  # checkpoint from each step
├── last_checkpoint  # last checkpoint, it can be used for resuming
└── stable_diffusion_v15_dreambooth_lora_dog.py  # latest config file
An illustrative output example is provided below:
- Inference with diffusers.pipeline: Once you have trained a model, specify the path to the saved model and run inference with the diffusers.pipeline module.
from pathlib import Path

import torch
from diffusers import DiffusionPipeline
from peft import PeftModel

# Directory written during training; it holds the saved adapter folders.
checkpoint = Path('work_dirs/stable_diffusion_v15_dreambooth_lora_dog/step999')
prompt = 'A photo of sks dog in a bucket'

# Load the base Stable Diffusion v1.5 pipeline in half precision.
pipe = DiffusionPipeline.from_pretrained(
    'runwayml/stable-diffusion-v1-5', torch_dtype=torch.float16)
pipe.to('cuda')

# Wrap the UNet with the trained LoRA adapter.
pipe.unet = PeftModel.from_pretrained(pipe.unet, checkpoint / "unet", adapter_name="default")

# Wrap the text encoder as well if an adapter was saved for it.
if (checkpoint / "text_encoder").exists():
    pipe.text_encoder = PeftModel.from_pretrained(
        pipe.text_encoder, checkpoint / "text_encoder", adapter_name="default"
    )

image = pipe(
    prompt,
    num_inference_steps=50
).images[0]
image.save('demo.png')
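The inference script above hardcodes the step999 directory. When a run saves several step directories, a helper along these lines could pick the most recent one and report which adapters it contains. This is a sketch under assumptions: the helper names `latest_step_dir` and `saved_adapters` are hypothetical and not part of DiffEngine, and it assumes that a text_encoder folder, when present, holds an adapter_config.json just like the unet folder shown in the output tree.

```python
import re
from pathlib import Path
from typing import Optional


def latest_step_dir(work_dir: Path) -> Optional[Path]:
    """Return the `step<N>` subdirectory with the largest N, or None if absent."""
    candidates = []
    for child in Path(work_dir).iterdir():
        match = re.fullmatch(r"step(\d+)", child.name)
        if match and child.is_dir():
            candidates.append((int(match.group(1)), child))
    if not candidates:
        return None
    # Compare numerically so that step1999 ranks above step999.
    return max(candidates, key=lambda item: item[0])[1]


def saved_adapters(step_dir: Path) -> list:
    """List the adapter folders (unet, text_encoder) that contain a config file."""
    return [
        name
        for name in ("unet", "text_encoder")
        if (Path(step_dir) / name / "adapter_config.json").is_file()
    ]


# Example usage; skipped silently when the directory does not exist.
work_dir = Path("work_dirs/stable_diffusion_v15_dreambooth_lora_dog")
if work_dir.is_dir():
    checkpoint = latest_step_dir(work_dir)
    if checkpoint is not None:
        print(checkpoint, saved_adapters(checkpoint))
```

The returned path can be used in place of the hardcoded `checkpoint` value in the inference script.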
Example Notebook
For a more hands-on introduction to DiffEngine, you can run the Example Notebook on Colaboratory. This notebook demonstrates the process of training using SDV1.5 and SDV2.1 DreamBooth configurations.
Documentation
For detailed user guides and advanced guides, please refer to our Documentation:
- Get Started
- Run Stable Diffusion
- Run Stable Diffusion XL
- Run Stable Diffusion DreamBooth
- Run Stable Diffusion XL DreamBooth
- Run Stable Diffusion LoRA
- Run Stable Diffusion XL LoRA
- Run Stable Diffusion ControlNet
- Run Stable Diffusion XL ControlNet
- Run IP Adapter
- Run T2I Adapter
- Run InstructPix2Pix
- Run Wuerstchen
- Run Wuerstchen LoRA
- Run LCM XL
- Run LCM XL LoRA
- Run PixArt-α
- Run PixArt-α LoRA
- Run PixArt-α DreamBooth
- Inference
- Introduction to DiffEngine
- Train ControlNet with DiffEngine
- On Architectural Compression of Text-to-Image Diffusion Models
- SSD-1B: A Leap in Efficient T2I Generation
Model Zoo
<details open> <div align="center"> <b>Supported algorithms</b> </div> <table align="center"> <tbody> <tr align="center" valign="bottom"> <td> <b>Stable Diffusions</b> </td> <td> <b>Stable Diffusion XLs</b> </td> <td> <b>DeepFloyd IFs</b> </td> <td> <b>Others</b> </td> </tr> <tr valign="top"> <td> <ul> <li><a href="diffengine/configs/stable_diffusion/README.md">Stable Diffusion (2022)</a></li> <li><a href="diffengine/configs/stable_diffusion_controlnet/README.md">ControlNet (ICCV'2023)</a></li> <li><a href="diffengine/configs/stable_diffusion_dreambooth/README.md">DreamBooth (CVPR'2023)</a></li> <li><a href="diffengine/configs/stable_diffusion_lora/README.md">LoRA (ICLR'2022)</a></li> <li><a href="diffengine/configs/distill_sd_dreambooth/README.md">Distill SD DreamBooth (2023)</a></li> <li><a href="diffengine/configs/stable_diffusion_inpaint/README.md">Inpaint</a></li> </ul> </td> <td> <ul> <li><a href="diffengine/configs/stable_diffusion_xl/README.md">Stable Diffusion XL (2023)</a></li> <li><a href="diffengine/configs/stable_diffusion_xl_controlnet/README.md">ControlNet (ICCV'2023)</a></li> <li><a href="diffengine/configs/stable_diffusion_xl_dreambooth/README.md">DreamBooth (CVPR'2023)</a></li> <li><a href="diffengine/configs/stable_diffusion_xl_lora/README.md">LoRA (ICLR'2022)</a></li> <li><a href="diffengine/configs/stable_diffusion_xl_controlnet_small/README.md">ControlNet Small (2023)</a></li> <li><a href="diffengine/configs/t2i_adapter/README.md">T2I-Adapter (2023)</a></li> <li><a href="diffengine/configs/ip_adapter/README.md">IP-Adapter (2023)</a></li> <li><a href="diffengine/configs/esd/README.md">Erasing Concepts from Diffusion Models (2023)</a></li> <li><a href="diffengine/configs/ssd_1b/README.md">SSD-1B (2023)</a></li> <li><a href="diffengine/configs/instruct_pix2pix/README.md">InstructPix2Pix (2022)</a></li> <li><a href="diffengine/configs/loha/README.md">LoHa (ICLR'2022)</a></li> <li><a href="diffengine/configs/lokr/README.md">LoKr (2022)</a></li> 
<li><a href="diffengine/configs/oft/README.md">OFT (NeurIPS'2023)</a></li> <li><a href="projects/controlnetxs/README.md">ControlNet-XS (2023)</a></li> <li><a href="diffengine/configs/stable_diffusion_xl_inpaint/README.md">Inpaint</a></li> </ul> </td> <td> <ul> <li><a href="diffengine/configs/deepfloyd_if/README.md">DeepFloyd IF (2023)</a></li> <li><a href="diffengine/configs/deepfloyd_if_dreambooth/README.md">DreamBooth (CVPR'2023)</a></li> </ul> </td> <td> <ul> <li><a href="diffengine/configs/min_snr_loss/README.md">Min-SNR Loss (ICCV'2023)</a></li> <li><a href="diffengine/configs/debias_estimation_loss/README.md">DeBias Estimation Loss (2023)</a></li> <li><a href="diffengine/configs/offset_noise/README.md">Offset Noise (2023)</a></li> <li><a href="diffengine/configs/pyramid_noise/README.md">Pyramid Noise (2023)</a></li> <li><a href="diffengine/configs/input_perturbation/README.md">Input Perturbation (2023)</a></li> <li><a href="diffengine/configs/timesteps_bias/README.md">Time Steps Bias (2023)</a></li> <li><a href="diffengine/configs/v_prediction/README.md">V Prediction (ICLR'2022)</a></li> <li><a href="diffengine/configs/diffusion_dpo/README.md">Diffusion DPO (2023)</a></li> </ul> </td> </tr> </tbody>
<tbody> <tr align="center" valign="bottom"> <td> <b>Wuerstchen</b> </td> <td> <b>Latent Consistency Models</b> </td> <td> <b>PixArt-α</b> </td> <td> <b>Kandinsky</b> </td> </tr> <tr valign="top"> <td> <ul> <li><a href="diffengine/configs/wuerstchen/README.md">Wuerstchen (2023)</a></li> <li><a href="diffengine/configs/wuerstchen_lora/README.md">LoRA (ICLR'2022)</a></li> </ul> </td> <td> <ul> <li><a href="diffengine/configs/lcm/README.md">Latent Consistency Models (2023)</a></li> <li><a href="diffengine/configs/lcm_lora/README.md">LoRA (ICLR'2022)</a></li> </ul> </td> <td> <ul> <li><a href="diffengine/configs/pixart_alpha/README.md">PixArt-α (2023)</a></li> <li><a href="diffengine/configs/pixart_alpha_lora/README.md">LoRA (ICLR'2022)</a></li> <li><a href="diffengine/configs/pixart_alpha_dreambooth/README.md">DreamBooth (CVPR'2023)</a></li> </ul> </td> <td> <ul> <li><a href="diffengine/configs/kandinsky_v22/README.md">Kandinsky 2.2 (2023)</a></li> <li><a href="diffengine/configs/kandinsky_v3/README.md">Kandinsky 3 (2023)</a></li> </ul> </td> </tr> </tbody>
<tbody> <tr align="center" valign="bottom"> <td> <b>aMUSEd</b> </td> </tr> <tr valign="top"> <td> <ul> <li><a href="diffengine/configs/amused/README.md">aMUSEd (2024)</a></li> </ul> </td> </tr> </tbody> </table> </details>
Contributing
We appreciate all contributions to improve DiffEngine. Please refer to CONTRIBUTING.md for the contribution guidelines.
License
This project is released under the Apache 2.0 license.
Citation
If DiffEngine is helpful to your research, please cite it as below.
@misc{diffengine2023,
title = {{DiffEngine}: diffusers training toolbox with mmengine},
author = {{DiffEngine Contributors}},
howpublished = {\url{https://github.com/okotaku/diffengine}},
year = {2023}
}
Sponsors
takuoko is a member of Z by HP Data Science Global Ambassadors. Special thanks to Z by HP for sponsoring a Z8G4 Workstation with dual A6000 GPUs and a ZBook with an RTX5000 GPU.
Acknowledgement
This repo borrows the architecture design and part of the code from mmengine, mmagic and diffusers.
Also, please check the following OpenMMLab and Hugging Face projects and their documentation.
@misc{mmengine2022,
title = {{MMEngine}: OpenMMLab Foundational Library for Training Deep Learning Models},
author = {MMEngine Contributors},
howpublished = {\url{https://github.com/open-mmlab/mmengine}},
year={2022}
}
@misc{mmagic2023,
title = {{MMagic}: {OpenMMLab} Multimodal Advanced, Generative, and Intelligent Creation Toolbox},
author = {{MMagic Contributors}},
howpublished = {\url{https://github.com/open-mmlab/mmagic}},
year = {2023}
}
@misc{von-platen-etal-2022-diffusers,
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
title = {Diffusers: State-of-the-art diffusion models},
year = {2022},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/huggingface/diffusers}}
}
@misc{peft,
title = {PEFT: State-of-the-art Parameter-Efficient Fine-Tuning methods},
author = {Sourab Mangrulkar and Sylvain Gugger and Lysandre Debut and Younes Belkada and Sayak Paul and Benjamin Bossan},
howpublished = {\url{https://github.com/huggingface/peft}},
year = {2022}
}