DiffusionFastForward

:rocket: Diffusion models are making the headlines as a new generation of powerful generative models.

However, much of the ongoing research considers solutions that are quite specific and require large computational resources for training.

:beginner: DiffusionFastForward offers a general template for diffusion models for images that can be a starting point for understanding and researching diffusion-based generative models.

The code structure is simple, so that you can easily customize it to your own applications.

:construction: Disclaimer: This repository does not provide any model weights. Its purpose is to enable training new weights on previously unexplored types of data.

Contents

There are three elements integrated into this project: the code, the notes, and the video course.


:computer: Code

This repository offers a starting point for training diffusion models on new types of data. It can serve as a baseline to be developed into more robust solutions tailored to the specific features of a given generative task.

It includes notebooks that can be run stand-alone (each also opens in Google Colab):

  1. 01-Diffusion-Sandbox - visualizations of the diffusion process (a minimal forward-process sketch follows this list)
  2. 02-Pixel-Diffusion - basic diffusion suitable for low-resolution data
  3. 03-Conditional-Pixel-Diffusion - image translation with diffusion for low-resolution data
  4. 04-Latent-Diffusion - latent diffusion suitable for high-resolution data
  5. 05-Conditional-Latent-Diffusion - image translation with latent diffusion
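
For orientation, below is a minimal sketch of the closed-form forward (noising) process that the first notebook visualizes. The linear beta schedule and its endpoints are illustrative assumptions, not necessarily the notebook's exact settings.

```python
import torch

# Closed-form forward process q(x_t | x_0) with a linear beta schedule.
# Schedule endpoints are illustrative assumptions, not the notebook's exact values.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)   # cumulative product of (1 - beta_t)

def q_sample(x0: torch.Tensor, t: int) -> torch.Tensor:
    """Sample x_t ~ q(x_t | x_0) directly, without iterating over steps."""
    noise = torch.randn_like(x0)
    a_bar = alpha_bars[t]
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

# Noise a dummy 64x64 RGB image (scaled to [-1, 1]) at a few timesteps.
x0 = torch.rand(3, 64, 64) * 2.0 - 1.0
for t in (0, 250, 500, 999):
    xt = q_sample(x0, t)
    print(f"t={t:4d}  mean={xt.mean().item():+.3f}  std={xt.std().item():.3f}")
```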

Dependencies

Assuming torch and torchvision are installed:

pip install pytorch-lightning==1.9.3 diffusers einops
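
As a quick sanity check (a convenience snippet, not part of the repository), the imports below should succeed once the dependencies are installed:

```python
# Verify the environment; versions other than those pinned above may also work.
import torch, torchvision
import pytorch_lightning as pl
import diffusers, einops

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("pytorch-lightning:", pl.__version__, "| diffusers:", diffusers.__version__)
```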

:bulb: Notes

Short summary notes are released as part of this repository; their content overlaps with the notebooks.

  1. 01-Diffusion-Theory - visualizations of the diffusion process
  2. 02-Pixel-Diffusion - basic diffusion suitable for low-resolution data
  3. 03-Conditional-Pixel-Diffusion - image translation with diffusion for low-resolution data
  4. 04-Latent-Diffusion - latent diffusion suitable for high-resolution data
  5. 05-Conditional-Latent-Diffusion - image translation with latent diffusion

:tv: Video Course (released on YouTube)

The course is released on YouTube and extends this repository. It covers some additional topics, such as seminal papers and ongoing research work.

![Video course overview](https://user-images.githubusercontent.com/13435425/222248673-bfcce06c-0f5b-421b-92b2-b4ed130c0dfb.png)

The current plan for the video course (links added upon publishing):


:moneybag: Training Cost

Most examples use one of two types of models, each trainable within a day:

PixelDiffusion (good for small images :baby:): direct diffusion in pixel space, appropriate for low-resolution data. A hedged training sketch follows the example outputs below.

| Image Resolution | Training Time | Memory Usage |
| --- | --- | --- |
| 64x64 | ~10 hrs | ~4 GB |

Example outputs: out-pixel-conditional-1, out-pixel-conditional-2, out-pixel-conditional-3 (sample images in the repository).
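
For intuition, here is a hedged sketch of what a pixel-space training step can look like, built on pytorch-lightning and the diffusers UNet2DModel. The class name, hyperparameters, and noise schedule are illustrative assumptions; the 02-Pixel-Diffusion notebook contains the repository's actual implementation.

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl
from diffusers import UNet2DModel

class PixelDiffusionSketch(pl.LightningModule):
    """Illustrative epsilon-prediction training step for pixel-space diffusion."""

    def __init__(self, image_size: int = 64, timesteps: int = 1000):
        super().__init__()
        self.timesteps = timesteps
        betas = torch.linspace(1e-4, 0.02, timesteps)                 # assumed schedule
        self.register_buffer("alpha_bars", torch.cumprod(1.0 - betas, dim=0))
        self.model = UNet2DModel(sample_size=image_size, in_channels=3, out_channels=3)

    def training_step(self, batch, batch_idx):
        x0 = batch                                                    # assumed (B, 3, H, W) in [-1, 1]
        t = torch.randint(0, self.timesteps, (x0.shape[0],), device=x0.device)
        noise = torch.randn_like(x0)
        a_bar = self.alpha_bars[t].view(-1, 1, 1, 1)
        xt = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise         # forward process
        pred = self.model(xt, t).sample                               # predict the added noise
        return F.mse_loss(pred, noise)

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=1e-4)
```

Training then reduces to a standard `pl.Trainer(...).fit(model, dataloader)` call.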

LatentDiffusion (good for large images :whale2:): diffusion in a compressed latent space, useful for high-resolution data. A sketch of the latent encoding step follows the example outputs below.

| Image Resolution | Training Time | Memory Usage |
| --- | --- | --- |
| 256x256 | ~20 hrs | ~5 GB |

Example outputs: out-latent-conditional-1, out-latent-conditional-2, out-latent-conditional-3 (sample images in the repository).
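
The latent-diffusion variant runs the same machinery in the compressed space of a pretrained autoencoder, which is what makes 256x256 training feasible on modest hardware. The sketch below illustrates the encode/decode step with a diffusers AutoencoderKL; the checkpoint name and scaling factor are illustrative assumptions, not necessarily what the 04-Latent-Diffusion notebook uses.

```python
import torch
from diffusers import AutoencoderKL

# Pretrained VAE used only for compression; the diffusion model itself is
# trained from scratch on the latents. The checkpoint choice is an assumption.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()

@torch.no_grad()
def to_latent(x: torch.Tensor) -> torch.Tensor:
    """Encode images in [-1, 1] into a lower-resolution latent tensor."""
    return vae.encode(x).latent_dist.sample() * 0.18215   # common SD scaling factor

@torch.no_grad()
def from_latent(z: torch.Tensor) -> torch.Tensor:
    """Decode latents back to image space."""
    return vae.decode(z / 0.18215).sample

x = torch.rand(1, 3, 256, 256) * 2.0 - 1.0    # dummy 256x256 image
z = to_latent(x)                               # e.g. (1, 4, 32, 32): diffusion runs here
print(x.shape, "->", z.shape)
```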


Other Software Resources

There are many great projects focused on diffusion generative models. However, most of them involve somewhat complex frameworks that are not always suitable for learning and preliminary experimentation.

Other Educational Resources

Some excellent materials have already been published on the topic! Huge respect to all of the creators :pray: - check them out if their work has helped you!

:coffee: Blog Posts

:crystal_ball: Explanation Videos

:wrench: Implementation Videos

:mortar_board: Video Lectures/Tutorials