# CFLD
Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis <br> Yanzuo Lu, Manlin Zhang, Andy J Ma, Xiaohua Xie, Jian-Huang Lai <br> IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), June 17-21, 2024, Seattle, USA
## TL;DR
If you want to cite and compare with our method, please download the generated images from Google Drive here (including 256x176 and 512x352 on DeepFashion, and 128x64 on Market-1501).
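For a quick side-by-side check, the sketch below scores the downloaded images against ground truth with SSIM via scikit-image. It is a minimal sketch, not the paper's evaluation protocol: the `generated` and `ground_truth` folder names are placeholders for wherever you unpack the files, and it relies only on the shared naming format mentioned in the News below.

```python
# A minimal sketch for scoring generated images against ground truth with SSIM.
# "generated" and "ground_truth" are hypothetical folder names; point them at
# your local copies of the Google Drive images and the DeepFashion test images.
import numpy as np
from pathlib import Path
from PIL import Image
from skimage.metrics import structural_similarity as ssim

gen_dir, gt_dir = Path("generated"), Path("ground_truth")
scores = []
for gen_path in sorted(gen_dir.glob("*.png")):
    gt_path = gt_dir / gen_path.name  # relies on the shared naming format
    gen = np.asarray(Image.open(gen_path).convert("RGB"))
    gt = np.asarray(Image.open(gt_path).convert("RGB").resize(gen.shape[1::-1]))
    scores.append(ssim(gt, gen, channel_axis=-1))
print(f"mean SSIM over {len(scores)} images: {np.mean(scores):.4f}")
```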
## News 🔥🔥🔥
- 2024/02/27  Our paper titled "Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis" is accepted by CVPR 2024.
- 2024/02/28  We release the code and upload the arXiv preprint.
- 2024/03/09  The checkpoints on the DeepFashion dataset are released on Google Drive.
- 2024/03/09  We note that the file naming used by different open-source codebases can be extremely confusing. To facilitate future work, we have organized the generated images of several methods that we used for qualitative comparisons in the paper. They are uniformly resized to 256x176 or 512x352, stored as PNG files, and share the same naming format. Enjoy! 🤗
- 2024/03/20  We upload a Jupyter notebook for inference. You can modify it as you like, e.g. replacing the conditional image with a customized one and randomly sampling a target pose from the test dataset.
- 2024/04/05  Our paper is accepted as a CVPR 2024 Highlight!!!
- 2024/04/10  The camera-ready version is now available on arXiv. The supplementary material with more discussion and results has been added.
## Preparation
### Install Environment
```bash
conda env create -f environment.yaml
```
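After activating the environment, a quick import check can confirm the core dependencies are in place. This is a minimal sketch; it assumes `environment.yaml` installs PyTorch and diffusers, which this repo builds on.

```python
# Quick sanity check for the conda environment (a sketch; assumes PyTorch and
# diffusers are installed by environment.yaml).
import torch
import diffusers

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("diffusers:", diffusers.__version__)
```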
### Download DeepFashion Dataset
- Download `Img/img_highres.zip` from the In-shop Clothes Retrieval Benchmark of DeepFashion and unzip it under the `./fashion` directory. (A password is required; please contact the authors of DeepFashion (not us!!!) for permission.)
- Download the train/test pairs and keypoints from DPTN and put them under the `./fashion` directory.
- Make sure the tree of the `./fashion` directory is as follows.

```
fashion
├── fashion-resize-annotation-test.csv
├── fashion-resize-annotation-train.csv
├── fashion-resize-pairs-test.csv
├── fashion-resize-pairs-train.csv
├── MEN
├── test.lst
├── train.lst
└── WOMEN
```

- Run `generate_fashion_datasets.py` with python; a quick sanity check of the resulting layout is sketched below.
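The following minimal sketch verifies the layout after the steps above. The file names come from the tree shown; the CSV column names are printed rather than assumed.

```python
# A minimal sanity check of the ./fashion layout (file names taken from the
# tree above; CSV headers are printed, not assumed).
import csv
from pathlib import Path

root = Path("./fashion")
for name in [
    "fashion-resize-pairs-train.csv",
    "fashion-resize-pairs-test.csv",
    "fashion-resize-annotation-train.csv",
    "fashion-resize-annotation-test.csv",
]:
    path = root / name
    assert path.exists(), f"missing {path}"
    with path.open() as f:
        print(name, "->", next(csv.reader(f)))  # header row
```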
### Download Pre-trained Models
- Download the following pre-trained models on demand and put them under the `./pretrained_models` directory.

| Model | Official Repository | Publicly Available |
| --- | --- | --- |
| U-Net | runwayml/stable-diffusion-v1-5 | diffusion_pytorch_model.safetensors |
| VAE | runwayml/stable-diffusion-v1-5 | diffusion_pytorch_model.safetensors |
| Swin-B | microsoft/Swin-Transformer | swin_base_patch4_window12_384_22kto1k.pth |
| CLIP (ablation only) | openai/clip-vit-large-patch14 | model.safetensors |

- Make sure the tree of the `./pretrained_models` directory is as follows.

```
pretrained_models
├── clip
│   ├── config.json
│   └── model.safetensors
├── scheduler
│   └── scheduler_config.json
├── swin
│   └── swin_base_patch4_window12_384_22kto1k.pth
├── unet
│   ├── config.json
│   └── diffusion_pytorch_model.safetensors
└── vae
    ├── config.json
    └── diffusion_pytorch_model.safetensors
```
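As a smoke test, the sketch below loads the local U-Net and VAE copies with the diffusers library (which the Stable Diffusion v1-5 weights above are packaged for) and the Swin checkpoint with plain `torch.load`. It only verifies that the files are readable, not that they match this repo's configs.

```python
# A smoke test for the downloaded weights (a sketch; paths follow the tree above).
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained("pretrained_models/unet")
vae = AutoencoderKL.from_pretrained("pretrained_models/vae")
swin = torch.load(
    "pretrained_models/swin/swin_base_patch4_window12_384_22kto1k.pth",
    map_location="cpu",
)

print(f"U-Net params: {sum(p.numel() for p in unet.parameters()) / 1e6:.1f}M")
print(f"VAE params:   {sum(p.numel() for p in vae.parameters()) / 1e6:.1f}M")
print("Swin checkpoint keys:", list(swin.keys())[:3], "...")
```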
## Training
For multi-GPU training, run the following command by default.

```bash
bash scripts/multi_gpu/pose_transfer_train.sh 0,1,2,3,4,5,6,7
```

For single-GPU training, run the following command by default.

```bash
bash scripts/single_gpu/pose_transfer_train.sh 0
```

For ablation studies, specify the config file as in the following example.

```bash
bash scripts/multi_gpu/pose_transfer_train.sh 0,1,2,3,4,5,6,7 --config_file configs/ablation_study/no_app.yaml
```
## Inference
For multi-GPU inference, specify the checkpoint path as in the following example.

```bash
bash scripts/multi_gpu/pose_transfer_test.sh 0,1,2,3,4,5,6,7 MODEL.PRETRAINED_PATH checkpoints
```

For single-GPU inference, specify the checkpoint path as in the following example.

```bash
bash scripts/single_gpu/pose_transfer_test.sh 0 MODEL.PRETRAINED_PATH checkpoints
```
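The trailing `MODEL.PRETRAINED_PATH checkpoints` arguments follow the KEY VALUE override pattern of yacs-style configs. The sketch below illustrates that pattern in isolation; whether this repo uses yacs and which keys its schema defines are assumptions here, so treat the names as illustrative only.

```python
# An illustration of yacs-style "KEY VALUE" command-line overrides (a sketch;
# the real config schema of this repo may differ from the keys shown here).
from yacs.config import CfgNode as CN

cfg = CN()
cfg.MODEL = CN()
cfg.MODEL.PRETRAINED_PATH = ""  # default, overridden from the command line

# Equivalent to appending "MODEL.PRETRAINED_PATH checkpoints" to the script call.
cfg.merge_from_list(["MODEL.PRETRAINED_PATH", "checkpoints"])
print(cfg.MODEL.PRETRAINED_PATH)  # -> checkpoints
```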
## Citation
```bibtex
@inproceedings{lu2024coarse,
  title={Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis},
  author={Lu, Yanzuo and Zhang, Manlin and Ma, Andy J and Xie, Xiaohua and Lai, Jian-Huang},
  booktitle={CVPR},
  year={2024}
}
```