<p align="center"> <img src="docs/assets/teaser1.jpg" alt="teaser" width="90%" height="90%"> </p>

# FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
[Paper] [Project Page] <br>
Sicheng Mo<sup>1*</sup>, Fangzhou Mu<sup>2*</sup>, Kuan Heng Lin<sup>1</sup>, Yanli Liu<sup>3</sup>, Bochen Guan<sup>3</sup>, Yin Li<sup>2</sup>, Bolei Zhou<sup>1</sup> <br> <sup>1</sup> UCLA, <sup>2</sup> University of Wisconsin-Madison, <sup>3</sup> Innopeak Technology, Inc <br> <sup>*</sup> Equal contribution <br> Computer Vision and Pattern Recognition (CVPR), 2024 <br>
## Overview

This is the official implementation of FreeControl, a training-free method for spatially controllable text-to-image generation with any pre-trained text-to-image diffusion model.
## Changelog

- 10/21/2024: Added SDXL pipeline (thanks to @shirleyzhu233).
- 02/19/2024: Initial code release. The paper is accepted to CVPR 2024.
## Getting Started

### Environment Setup

- We provide a conda env file for environment setup.

```bash
conda env create -f environment.yml
conda activate freecontrol
pip install -U diffusers
pip install -U gradio
```
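Before moving on, a quick sanity check can confirm that the environment resolved correctly. This is a minimal sketch, not part of the official setup; it only assumes the packages installed above and an available GPU.

```bash
# Optional sanity check (not part of the official setup):
# verify that PyTorch detects a GPU and that diffusers/gradio import cleanly.
python -c "import torch, diffusers, gradio; print('CUDA available:', torch.cuda.is_available())"
```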
### Sample Semantic Bases
- We provide three sample scripts in the scripts folder (one for each base model) that show how to compute target semantic bases; an illustrative invocation is sketched below.
- You may also download pre-computed bases from Google Drive. Put them in the dataset folder and launch the Gradio demo.
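The exact script names and arguments depend on the chosen base model, so the command below is only a sketch; the script path is hypothetical, and you should consult the scripts folder for the actual files.

```bash
# Illustrative only: the script name below is hypothetical.
# See the scripts folder for the actual per-model scripts and their arguments.
bash scripts/sample_semantic_bases_sd15.sh
```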
### Gradio demo

- We provide a graphical user interface (GUI) for users to try out FreeControl. Run the following command to start the demo.

```bash
python gradio_app.py
```
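Once running, Gradio serves the demo locally, by default at http://127.0.0.1:7860. The snippet below uses Gradio's standard GRADIO_SERVER_PORT environment variable to pick a different port; this is generic Gradio behavior, not a FreeControl-specific option.

```bash
# Generic Gradio behavior (not a FreeControl-specific flag):
# choose the port the demo is served on via Gradio's standard env variable.
GRADIO_SERVER_PORT=7861 python gradio_app.py
```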
## Gallery
We are building a gallery of images generated with FreeControl. You are welcome to share your generated images with us.
## Contact

## Reference

```bibtex
@article{mo2023freecontrol,
  title={FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition},
  author={Mo, Sicheng and Mu, Fangzhou and Lin, Kuan Heng and Liu, Yanli and Guan, Bochen and Li, Yin and Zhou, Bolei},
  journal={arXiv preprint arXiv:2312.07536},
  year={2023}
}
```