Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization (ECCV2024)
Tao Yang<sup>1</sup>, Rongyuan Wu<sup>2</sup>, Peiran Ren<sup>3</sup>, Xuansong Xie<sup>3</sup>, Lei Zhang<sup>2</sup>
<sup>1</sup>ByteDance Inc.
<sup>2</sup>Department of Computing, The Hong Kong Polytechnic University
<sup>3</sup>DAMO Academy, Alibaba Group
News
(2024-9-4) PASD-SDXL is now publicly available. Please give it a try via `python3 test_pasd_sdxl.py`!
(2024-8-15) PASD-SDXL will be released soon. It significantly outperforms PASD-SD1.5. Stay tuned! <img src="samples/RealPhoto60_06.png" width="390px"/> <img src="samples/RealPhoto60_22.png" width="390px"/> <img src="samples/RealPhoto60_09.png" width="390px"/> <img src="samples/RealPhoto60_56.png" width="390px"/>
(2024-7-1) Accepted by ECCV 2024. An updated version of our paper will be released soon.
(2024-3-18) Please try our colorization model via `python test_pasd.py --pasd_model_path runs/pasd_color/checkpoint-180000 --control_type grayscale --high_level_info caption --use_pasd_light`. You should use the noise scheduler provided in `runs/pasd_color/scheduler`, which has been updated to ensure zero-terminal SNR so that the residual RGB signal does not leak through during training. Please read the updated paper for more details.
(2024-3-18) We have updated the paper. The weights and datasets are now available on Huggingface.
(2024-1-16) You may also want to check out our new works SeeSR and Phantom.
(2023-10-20) Add an additional noise level option via `--added_noise_level`, which lets the SR result strike a good balance between "extremely detailed" and "over-smoothed". Very interesting! You can now freely control the detail level of the SR output.
(2023-10-18) Completely solved the previously reported issues by initializing the latents with the input LR image. Interestingly, the SR results also become much more stable.
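A minimal sketch of this latent-initialization idea, assuming the standard diffusers VAE/scheduler API (the actual pipeline code in this repo may differ):
```python
# Sketch: initialize the diffusion latents from the (upscaled) LR input instead of pure noise.
# Assumes diffusers' AutoencoderKL / DDPMScheduler; class choices and values are illustrative.
import torch
from diffusers import AutoencoderKL, DDPMScheduler

vae = AutoencoderKL.from_pretrained("checkpoints/stable-diffusion-v1-5", subfolder="vae").to("cuda")
scheduler = DDPMScheduler.from_pretrained("checkpoints/stable-diffusion-v1-5", subfolder="scheduler")

def init_latents_from_lr(lr_image: torch.Tensor, start_timestep: int = 999) -> torch.Tensor:
    """lr_image: LR input resized to the target size, in [-1, 1], shape (B, 3, H, W), H and W divisible by 8."""
    with torch.no_grad():
        latents = vae.encode(lr_image.to("cuda")).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    timesteps = torch.tensor([start_timestep], device=latents.device)
    # Diffuse the LR latents to the starting timestep; the denoising loop then refines them.
    return scheduler.add_noise(latents, noise, timesteps)
```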
(2023-10-11) Colab demo is now available. Credits to Masahide Okada.
(2023-10-09) Add training dataset.
(2023-09-28) Add tiled latent to allow upscaling ultra high-resolution images. Please carefully set `latent_tiled_size` as well as `--decoder_tiled_size` when upscaling large images.
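An illustrative invocation (the argument spellings and tile sizes below are placeholders; check the arguments in test_pasd.py for the exact names and sensible defaults):
python test_pasd.py --upscale 4 --latent_tiled_size 320 --decoder_tiled_size 224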
(2023-09-12) Add Gradio demo.
(2023-09-11) Upload pre-trained models.
(2023-09-07) Upload source codes.
Our model can perform various tasks. We hope you enjoy it.
Realistic Image SR
<img src="samples/frog.gif" width="390px"/> <img src="samples/house.gif" width="390px"/>
Old photo restoration
<img src="samples/629e4da70703193b.gif" width="390px" height="520"/> <img src="samples/27d38eeb2dbbe7c9.gif" width="390px" height="520"/>
Personalized Stylization
<img src="samples/000020x2.gif" width="390px"/> <img src="samples/000067x2.gif" width="390px"/>
Colorization
<img src="samples/000004x2.gif" width="390px"/> <img src="samples/000080x2.gif" width="390px"/>
Installation
Installation using pip
The package is not yet hosted on PyPI, so please install it from GitHub:
#pip install torch # required by xformers, which is unavailable on Mac
pip install git+https://github.com/yangxy/PASD.git
#or: pip install git+ssh://git@github.com/yangxy/PASD.git
Download the checkpoint config files from the `main` branch to the local directory `./checkpoints`:
wget -O - https://github.com/yangxy/PASD/archive/main.tar.gz | tar xz --strip=1 "PASD-main/checkpoints"
After this, download the model pickle files into `./checkpoints`, and you will be able to create models via `from_pretrained()`.
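For reference, a minimal loading sketch assuming the standard diffusers `from_pretrained()` API and the usual SD1.5 folder layout (the repo defines its own UNet/ControlNet variants, so substitute those classes where appropriate):
```python
# Sketch: build SD1.5 components from the local checkpoints directory via from_pretrained().
# Assumes the usual diffusers layout (vae/, unet/, text_encoder/, tokenizer/, scheduler/ subfolders).
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

base = "checkpoints/stable-diffusion-v1-5"
vae = AutoencoderKL.from_pretrained(base, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(base, subfolder="unet")
text_encoder = CLIPTextModel.from_pretrained(base, subfolder="text_encoder")
tokenizer = CLIPTokenizer.from_pretrained(base, subfolder="tokenizer")
scheduler = DDPMScheduler.from_pretrained(base, subfolder="scheduler")
```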
Development
- Clone this repository:
git clone https://github.com/yangxy/PASD.git
cd PASD
pip install -e .
- Download the SD1.5 models from Hugging Face and put them into `checkpoints/stable-diffusion-v1-5`.
- Prepare training datasets. Please check `dataloader/localdataset.py` and `dataloader/webdataset.py` carefully and set the paths correctly. We highly recommend using `dataloader/webdataset.py`.
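A rough illustration of the WebDataset loading pattern (a sketch using the webdataset package; the shard path and sample keys here are hypothetical, see `dataloader/webdataset.py` for the actual keys and transforms):
```python
# Sketch of a WebDataset-style loader; the shard URL pattern and sample keys are hypothetical.
import webdataset as wds
from torch.utils.data import DataLoader

shards = "datasets/train/shard-{000000..000099}.tar"  # hypothetical path pattern
dataset = (
    wds.WebDataset(shards)
    .shuffle(1000)            # shuffle within a buffer of samples
    .decode("pil")            # decode image entries with PIL
    .to_tuple("jpg", "txt")   # hypothetical keys: HR image and its caption
)
loader = DataLoader(dataset.batched(4), batch_size=None, num_workers=4)
```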
- Download our training datasets: DIV2K_train_HR | DIV8K-0 | DIV8K-1 | DIV8K-2 | DIV8K-3 | DIV8K-4 | DIV8K-5 | FFHQ_5K | Flickr2K_HR-0 | Flickr2K_HR-1 | Flickr2K_HR-2 | OST_animal | OST_building | OST_grass | OST_mountain | OST_plant | OST_sky | OST_water | Unsplash2K
- Train a PASD:
bash ./train_pasd.sh
If you want to train pasd_light, add `--use_pasd_light`.
- Test PASD.
Download our pre-trained models pasd | pasd_rrdb | pasd_light | pasd_light_rrdb and put them into `runs/`.
pip install -r requirements-test.txt # install additional dependencies
python test_pasd.py # --use_pasd_light --use_personalized_model
Please read the arguments in `test_pasd.py` carefully. We adopt the tiled VAE method proposed by multidiffusion-upscaler-for-automatic1111 to save GPU memory.
Please try `--use_personalized_model` for personalized stylization, old photo restoration, and real-world SR. Set `--conditioning_scale` to control the stylization strength.
We use personalized models including majicMIX realistic (for SR and restoration), ToonYou (for stylization), and modern disney style (UNet only, for stylization). You can download more from the community and put them into `checkpoints/personalized_models`.
If the default settings do not yield good results, try a different `--pasd_model_path`, `--seed`, `--prompt`, `--upscale`, or `--high_level_info` to get better performance.
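For example, an illustrative real-world SR run with a personalized model (the flag values below are only examples, not recommended settings):
python test_pasd.py --use_personalized_model --conditioning_scale 1.0 --upscale 4 --seed 36 --prompt "clean, sharp, high-resolution photo"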
- Gradio Demo
python gradio_pasd.py
Main idea
<img src="samples/pasd_arch.png" width="780px"/>
Citation
If our work is useful for your research, please consider citing:
@inproceedings{yang2023pasd,
title={Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization},
author={Yang, Tao and Wu, Rongyuan and Ren, Peiran and Xie, Xuansong and Zhang, Lei},
booktitle={The European Conference on Computer Vision (ECCV)},
year={2024}
}
Acknowledgments
Our project is based on diffusers.
Contact
If you have any questions or suggestions about this paper, feel free to reach me at yangtao9009@gmail.com.