# SD-VITON: Virtual Try-On
This is the official repository for the following paper:
**Towards Squeezing-Averse Virtual Try-On via Sequential Deformation** [arXiv]
Sang-Heon Shim, Jiwoo Chung, Jae-Pil Heo
Accepted to AAAI 2024.
## Notice
This repository is currently built only for sharing the source code of an academic research paper.
It has several limitations; please check them out in the Limitations section below.
## News
- 2024-01-31: We have released the source code and checkpoints.
## Installation
Clone this repository:
```bash
git clone https://github.com/SHShim0513/SD-VITON.git
cd ./SD-VITON/
```
Install PyTorch and other dependencies:
```bash
conda create -n {env_name} python=3.8
conda activate {env_name}
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia
pip install opencv-python torchgeometry Pillow tqdm tensorboardX scikit-image scipy timm==0.4.12
```
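Before moving on, you can confirm the environment works with a quick sanity check (our suggestion, not part of the original instructions):

```bash
# Prints the PyTorch version and whether CUDA is visible.
# With the pytorch-lts channel above, expect a 1.8.x version and "True" on a GPU machine.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```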
## Dataset

We train and evaluate our model on the VITON-HD dataset from the following link.
We assume that you have downloaded it into `./data`.
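For reference, the extracted dataset should look roughly like the sketch below. This layout follows HR-VITON, which this repository builds on; the exact subfolder names (parse maps, pose annotations, etc.) may differ, so verify against the archive you download.

```
./data
├── train_pairs.txt
├── test_pairs.txt
├── train
│   ├── image
│   ├── cloth
│   ├── cloth-mask
│   └── ...
└── test
    ├── image
    ├── cloth
    ├── cloth-mask
    └── ...
```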
## Inference
Here are the download links for each model checkpoint:
| Dataset | Network Type | Output Resolution | Google Cloud |
|---|---|---|---|
| VITON-HD | Try-on condition generator | Appearance flows with 128 x 96 | Download |
| VITON-HD | Try-on image generator | Images with 1024 x 768 | Download |
- AlexNet (LPIPS): link; we assume that you have downloaded it into `./eval_models/weights/v0.1`.
```bash
python3 test_generator.py --occlusion --test_name {test_name} --tocg_checkpoint {condition generator ckpt} --gpu_ids {gpu_ids} --gen_checkpoint {image generator ckpt} --datasetting unpaired --dataroot {dataset_path} --data_list {pair_list_textfile} --composition_mask
```
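As a concrete illustration, a filled-in call might look like the following. The test name and checkpoint paths here are placeholders we chose, not files shipped with the repository; `test_pairs.txt` is the unpaired list distributed with VITON-HD.

```bash
# Illustrative only: test name and checkpoint paths are assumptions.
python3 test_generator.py --occlusion --test_name sdviton_unpaired \
    --tocg_checkpoint ./checkpoints/tocg_final.pth \
    --gen_checkpoint ./checkpoints/gen_final.pth \
    --gpu_ids 0 \
    --datasetting unpaired \
    --dataroot ./data \
    --data_list test_pairs.txt \
    --composition_mask
```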
## Training

### Try-on condition generator

```bash
python3 train_condition.py --gpu_ids {gpu_ids} --Ddownx2 --Ddropout --interflowloss --occlusion --tvlambda_tvob 2.0 --tvlambda_taco 2.0
```
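For example, on a single GPU (the GPU ID is an assumption; adjust it to your machine):

```bash
# Example run on GPU 0 (GPU ID is an assumption)
python3 train_condition.py --gpu_ids 0 --Ddownx2 --Ddropout --interflowloss --occlusion --tvlambda_tvob 2.0 --tvlambda_taco 2.0
```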
### Try-on image generator

```bash
python3 train_generator.py --name test -b 4 -j 8 --gpu_ids {gpu_ids} --fp16 --tocg_checkpoint {condition generator ckpt path} --occlusion --composition_mask
```
This stage takes approximately 4 days with two A6000 GPUs.
To use the `--fp16` option, you need to install the apex library.
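One common way to install apex from source (this is NVIDIA's standard recipe rather than a command from this repository; check the apex README for the variant matching your pip version):

```bash
git clone https://github.com/NVIDIA/apex
cd apex
# Build with the C++/CUDA extensions required for fused fp16 ops
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```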
## Limitations

To the best of our knowledge, our work still has several limitations that are not unique to our method.

### Issue #1: crack artifacts
Several samples suffer from crack artifacts.
To the best of our knowledge, the cracks are amplified by the up-sizing of the last appearance flows (AFs): our network infers the last AFs at 128 x 96 resolution and then up-scales them to 1024 x 768, which enlarges the crack regions as well.
One way to slightly reduce the artifact is to infer the last AFs at a resolution closer to the image resolution (see "After").
We provide checkpoints where the networks infer the AFs at 256 x 192 and the image at 512 x 384 resolution.
| Dataset | Network Type | Output Resolution | Google Cloud |
|---|---|---|---|
| VITON-HD | Try-on condition generator | Appearance flows with 256 x 192 | Download |
| VITON-HD | Try-on image generator | Images with 512 x 384 | Download |
The corresponding script for inference is as follows:
```bash
python3 test_generator.py --occlusion --test_name {test_name} --tocg_checkpoint {condition generator ckpt} --gpu_ids {gpu_ids} --gen_checkpoint {image generator ckpt} --datasetting unpaired --dataroot {dataset_path} --data_list {pair_list_textfile} --fine_width 384 --fine_height 512 --num_upsampling_layers more --cond_G_ngf 48 --cond_G_input_width 384 --cond_G_input_height 512 --cond_G_num_layers 6
```
### Issue #2: clothes behind the neck

As with other methods, our network cannot fully remove clothing textures behind the neck, so they remain in the generated samples.
A solution would be to mask out such regions when pre-processing the inputs.
We did not apply this additional step, since such masks are not included in the dataset.
## Acknowledgments

This repository is built on top of the HR-VITON repository. Thanks for the great work.
## Citation
If you find this work useful for your research, please cite our paper:
```bibtex
@article{shim2023towards,
  title={Towards Squeezing-Averse Virtual Try-On via Sequential Deformation},
  author={Shim, Sang-Heon and Chung, Jiwoo and Heo, Jae-Pil},
  journal={arXiv preprint arXiv:2312.15861},
  year={2023}
}
```