Awesome

Huggingface Diffusers Compatible SDXL Unet Rewrite

Find this useful? I'm accumulating coffee until they become a A100 gpu.

Why do this?

While huggingface diffusers library is amazing, nowdays, its UNet2DConditionModel's implementation has gotten extremely big. There are various configurations, branches during initialization, and many other "important-yet-not-related" stuff that make it hard to understand. (I am also partly to blame this, because LoRA with AttentionProcessor logic has also gotten rather huge part of the Unet implementation.) I would argue it got bigger than one researcher can now reasonably understand and maintain, let alone extend. This will of course have pros and cons, but for many researchers, this is not ideal. (Trust me, I use diffusers all the time and still get confused by the codebase.)

Since SDXL will likely be used by many researchers, I think it is very important to have concise implementations of the models, so that SDXL can be easily understood and extended.

sdxl_rewrite.py tries to remove all the unnecessary parts of the original implementation, and tries to make it as concise as possible.

Usage

SDXL Rewrite tries to be directly compatible with the original diffusers library. You can use it like this:

from sdxl_rewrite import UNet2DConditionModel
unet_new = UNet2DConditionModel().cuda().half()

# Load weights from the original model
from diffusers import DiffusionPipeline


pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
).to("cuda")

unet_new.load_state_dict(pipe.unet.state_dict())

# use the weights
pipe.unet = unet_new

Obviously in practice you would never do this. You would normaly copy this codebase and modify it to your needs, like putting adapters, loras, other modalities, etc etc.

Have a look at example.ipynb for bit more examples to use it directly with diffusers library.