<div align="center">

LinFusion

<a href="https://arxiv.org/abs/2409.02097"><img src="https://img.shields.io/badge/arXiv-2409.02097-A42C25.svg" alt="arXiv"></a> <a href="https://lv-linfusion.github.io"><img src="https://img.shields.io/badge/ProjectPage-LinFusion-376ED2#376ED2.svg" alt="Home Page"></a> <a href="https://huggingface.co/spaces/Huage001/LinFusion-SD-v1.5"><img src="https://img.shields.io/static/v1?label=HuggingFace&message=gradio demo&color=yellow"></a>

</div>

LinFusion: 1 GPU, 1 Minute, 16K Image <br> Songhua Liu, Weihao Yu, Zhenxiong Tan, and Xinchao Wang <br> Learning and Vision Lab, National University of Singapore <br>

🔥News

[2024/09/28] We release the evaluation code on the COCO benchmark!

[2024/09/27] We successfully integrate LinFusion into DistriFusion, an effective and efficient strategy for generating an image in parallel, and achieve even more significant acceleration! Please refer to the example here!

[2024/09/26] We enable 16K image generation with merely 24GB of GPU memory! Please refer to the example here!

[2024/09/20] We release a more advanced pipeline for ultra-high-resolution image generation using SD-XL! It can be used for text-to-image generation and image super-resolution!

[2024/09/20] We release the training code for Stable Diffusion XL here!

[2024/09/13] We release LinFusion models for Stable Diffusion v-2.1 and Stable Diffusion XL!

[2024/09/13] We release the training code for Stable Diffusion v-1.5, v-2.1, and their variants here!

[2024/09/08] We release the code for 16K image generation here!

[2024/09/05] Gradio demo for SD-v1.5 is released! Text-to-image, image-to-image, and IP-Adapter are currently supported.

Supported Models

  1. Yuanshi/LinFusion-1-5: For Stable Diffusion v-1.5 and its variants. <a href="https://huggingface.co/Yuanshi/LinFusion-1-5"><img src="https://img.shields.io/badge/%F0%9F%A4%97-LinFusion for SD v1.5-yellow"></a>
  2. Yuanshi/LinFusion-2-1: For Stable Diffusion v-2.1 and its variants. <a href="https://huggingface.co/Yuanshi/LinFusion-2-1"><img src="https://img.shields.io/badge/%F0%9F%A4%97-LinFusion for SD v2.1-yellow"></a>
  3. Yuanshi/LinFusion-XL: For Stable Diffusion XL and its variants. <a href="https://huggingface.co/Yuanshi/LinFusion-XL"><img src="https://img.shields.io/badge/%F0%9F%A4%97-LinFusion for SD XL-yellow"></a>

Quick Start
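
Below is a minimal sketch of plugging LinFusion into a standard diffusers text-to-image pipeline. The `src.linfusion` import path and the `LinFusion.construct_for` helper are assumptions based on this repo's layout and may not match the exact API; the model id is just an example.

```python
# Minimal sketch (assumed API): patch a diffusers SD v1.5 pipeline with LinFusion.
import torch
from diffusers import AutoPipelineForText2Image

from src.linfusion import LinFusion  # assumed import path within this repo

# Load Stable Diffusion v1.5 (or a variant) as usual.
pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swap the self-attention layers for LinFusion's linear-attention modules and
# load the matching weights (e.g., Yuanshi/LinFusion-1-5 for SD v1.5 variants).
linfusion = LinFusion.construct_for(pipeline)  # assumed helper name

image = pipeline("An astronaut riding a horse on the moon, highly detailed").images[0]
image.save("output.png")
```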

Gradio Demo
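
The hosted demo is linked in the badges above. For running something similar locally, here is a minimal text-to-image Gradio sketch; the official demo script supports more (image-to-image, IP-Adapter), and its file name and options are not assumed here.

```python
# Minimal local text-to-image demo sketch; the official demo script may differ.
import gradio as gr
import torch
from diffusers import AutoPipelineForText2Image

from src.linfusion import LinFusion  # assumed import path, as in the Quick Start sketch

pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
LinFusion.construct_for(pipeline)  # assumed helper name

def generate(prompt: str):
    # Run a plain text-to-image pass with the LinFusion-patched pipeline.
    return pipeline(prompt).images[0]

gr.Interface(
    fn=generate,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Image(label="Generated image"),
    title="LinFusion SD v1.5 (sketch)",
).launch()
```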

Ultrahigh-Resolution Generation

In terms of efficiency, our method supports ultrahigh-resolution generation, such as 16K images. However, directly applying diffusion models trained at low resolutions to higher-resolution generation can result in content distortion and duplication. To tackle this challenge, we apply the following techniques:
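
As a rough illustration of the general coarse-to-fine idea (generate at the training resolution, then upsample and refine with an image-to-image pass), here is a generic sketch. It is not necessarily the exact set of techniques used in this repo, and the `LinFusion` import is the same assumption as in the Quick Start sketch.

```python
# Generic coarse-to-fine illustration, not necessarily this repo's exact technique:
# generate at the training resolution, upsample, then refine with an img2img pass.
import torch
from diffusers import AutoPipelineForImage2Image, AutoPipelineForText2Image

from src.linfusion import LinFusion  # assumed import path, as in the Quick Start sketch

prompt = "A panoramic photo of snowy mountains at sunrise, ultra detailed"

t2i = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
LinFusion.construct_for(t2i)  # assumed helper name

# Stage 1: generate at the resolution the model was trained on.
base = t2i(prompt, height=512, width=512).images[0]

# Stage 2: upsample and refine; LinFusion's linear attention keeps the cost of the
# attention layers roughly linear in the number of pixels at high resolutions.
i2i = AutoPipelineForImage2Image.from_pipe(t2i)
refined = i2i(prompt, image=base.resize((2048, 2048)), strength=0.5).images[0]
refined.save("highres.png")
```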

Training

Evaluation

Following GigaGAN, we use 30,000 COCO captions to generate 30,000 images for evaluation. FID against COCO val2014 is reported as the image-quality metric, and CLIP text cosine similarity is used to measure text-image alignment.
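
As a sketch of how these two metrics can be computed, the snippet below uses torchmetrics; the CLIP backbone, preprocessing, and caption sampling here are assumptions and may differ from the repo's evaluation code.

```python
# Sketch of the two reported metrics using torchmetrics; the repo's evaluation
# code may use a different CLIP backbone, preprocessing, and caption sampling.
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.multimodal.clip_score import CLIPScore

fid = FrechetInceptionDistance(feature=2048)
clip_score = CLIPScore(model_name_or_path="openai/clip-vit-base-patch16")  # backbone is an assumption

def update_metrics(real_images: torch.Tensor, fake_images: torch.Tensor, captions: list[str]):
    # Images are uint8 tensors of shape (N, 3, H, W); captions align with fake_images.
    fid.update(real_images, real=True)        # COCO val2014 reference images
    fid.update(fake_images, real=False)       # generated images
    clip_score.update(fake_images, captions)  # text-image alignment

# After feeding all 30,000 generated images and the reference set:
# print("FID:", fid.compute().item())
# print("CLIP score:", clip_score.compute().item())
```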

ToDo

Acknowledgement

Citation

If you find this repo helpful, please consider citing:

@article{liu2024linfusion,
  title         = {LinFusion: 1 GPU, 1 Minute, 16K Image},
  author        = {Liu, Songhua and Yu, Weihao and Tan, Zhenxiong and Wang, Xinchao},
  year          = {2024},
  eprint        = {2409.02097},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}