# ProFusion

ProFusion (with an encoder pre-trained on a large dataset such as CC3M) can be used to efficiently construct a customization dataset, which can then be used to train a tuning-free customization assistant (CAFE).

Given a test image, the assistant can perform customized generation in a tuning-free manner: it takes complex user input and generates text explanation and elaboration along with the image, without any fine-tuning.

<br>
<div style="text-align: center;">
  <img src="./imgs/main_results_cafe.jpg" alt="examples" width="90%">
</div>
<p align="center"> Results from CAFE </p>
<br>

<br>
<div style="text-align: center;">
  <img src="./imgs/object_results_cafe.jpg" alt="examples" width="70%">
</div>
<p align="center"> Results from CAFE </p>
<br>

Code for *Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach* (arXiv:2305.13579).

<br>
<div style="text-align: center;">
  <img src="./imgs/ProFusion_example.jpg" alt="examples" width="90%">
</div>
<p align="center"> Results from ProFusion </p>
<br>

ProFusion is a framework for customizing pre-trained, large-scale text-to-image generation models; in our examples, the base model is Stable Diffusion 2. <br>

<div style="text-align: center;">
  <img src="./imgs/framework.jpg" alt="framework" width="90%">
</div>
<p align="center"> Illustration of the proposed ProFusion </p>
<br>

With ProFusion, you can generate an unlimited number of creative images for a novel or unique concept from a single test image, on a single GPU (about 20 GB of memory is needed when fine-tuning with batch size 1).

<br>
<div style="text-align: center;">
  <img src="./imgs/examples.png" alt="examples" width="90%">
</div>
<p align="center"> Results from ProFusion </p>
<br>

## Example
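As a minimal starting point, the sketch below generates images with the backbone that ProFusion customizes (Stable Diffusion 2, loaded through the `diffusers` library). Note this is only vanilla text-to-image generation with the base model, not ProFusion's customization itself; the prompt is illustrative.

```python
# Minimal sketch: plain generation with the base model (Stable Diffusion 2).
# This does NOT perform customization; ProFusion would additionally inject a
# learned pseudo-word for the custom concept into the prompt.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-base",  # base model used in our examples
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a photo of a person in the style of an oil painting",  # illustrative prompt
    num_inference_steps=50,
    guidance_scale=7.5,
).images[0]
image.save("example.png")
```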

## Train Your Own Encoder

If you want to train a PromptNet encoder for other domains, or on your own dataset, the simplified sketch below outlines the training objective.
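The following is only a minimal sketch of the training loop, not the paper's exact PromptNet: as a stand-in encoder it uses a CLIP vision backbone (`openai/clip-vit-large-patch14`) with a linear projection, conditions the frozen UNet directly through cross-attention, and trains with the standard epsilon-prediction denoising loss. The real PromptNet instead predicts a word embedding used inside the text prompt; see the paper and the training scripts in this repo for the actual architecture and data pipeline.

```python
# Simplified training sketch (NOT the exact PromptNet from the paper):
# a trainable image encoder conditions a frozen Stable Diffusion 2 UNet,
# optimized with the standard epsilon-prediction diffusion loss.
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPVisionModel

base = "stabilityai/stable-diffusion-2-base"
device = "cuda"

# Frozen Stable Diffusion 2 components.
vae = AutoencoderKL.from_pretrained(base, subfolder="vae").to(device).requires_grad_(False)
unet = UNet2DConditionModel.from_pretrained(base, subfolder="unet").to(device).requires_grad_(False)
scheduler = DDPMScheduler.from_pretrained(base, subfolder="scheduler")

# Trainable stand-in encoder: CLIP vision backbone plus a linear projection
# to the UNet's cross-attention width (1024 for Stable Diffusion 2).
encoder = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14").to(device)
proj = torch.nn.Linear(encoder.config.hidden_size, unet.config.cross_attention_dim).to(device)
opt = torch.optim.AdamW(list(encoder.parameters()) + list(proj.parameters()), lr=1e-5)

def train_step(pixel_values, clip_pixel_values):
    """pixel_values: (B, 3, 512, 512) in [-1, 1]; clip_pixel_values: CLIP-preprocessed crops."""
    with torch.no_grad():
        latents = vae.encode(pixel_values).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    t = torch.randint(0, scheduler.config.num_train_timesteps,
                      (latents.shape[0],), device=device)
    noisy = scheduler.add_noise(latents, noise, t)

    # Condition denoising on features of the clean reference image, so the
    # encoder learns to carry the concept's identity.
    cond = proj(encoder(pixel_values=clip_pixel_values).last_hidden_state)
    pred = unet(noisy, t, encoder_hidden_states=cond).sample

    loss = F.mse_loss(pred, noise)  # standard epsilon-prediction objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```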

## Citation

```bibtex
@article{zhou2023enhancing,
  title={Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach},
  author={Zhou, Yufan and Zhang, Ruiyi and Sun, Tong and Xu, Jinhui},
  journal={arXiv preprint arXiv:2305.13579},
  year={2023}
}
```