Image Completion Transformer (ICT)

<img src='imgs/teaser.png'/>

Project Page | Paper (ArXiv) | Pre-trained Models :fire: | Supplemental Material

This repository is the official PyTorch implementation of our ICCV 2021 paper, High-Fidelity Pluralistic Image Completion with Transformers.

Ziyu Wan<sup>1</sup>, Jingbo Zhang<sup>1</sup>, Dongdong Chen<sup>2</sup>, Jing Liao<sup>1</sup> <br> <sup>1</sup>City University of Hong Kong, <sup>2</sup>Microsoft Cloud AI

:balloon: Prerequisites

pip install -r requirements.txt

To run inference directly, first download the pretrained models from Dropbox:

cd ICT
wget -O ckpts_ICT.zip "https://www.dropbox.com/s/we886b1fqf2qyrs/ckpts_ICT.zip?dl=1"
unzip ckpts_ICT.zip
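If `wget` and `unzip` are not available (for example on Windows), the same two steps can be sketched with the Python standard library. The helper names `download_checkpoints` and `extract_checkpoints` are illustrative assumptions and are not part of this repository; the URL is the Dropbox link above.

```python
import urllib.request
import zipfile

# Dropbox link from the command above; dl=1 requests a direct download.
CKPT_URL = "https://www.dropbox.com/s/we886b1fqf2qyrs/ckpts_ICT.zip?dl=1"


def download_checkpoints(url=CKPT_URL, zip_path="ckpts_ICT.zip"):
    """Hypothetical helper mirroring `wget -O`: save the archive to zip_path."""
    urllib.request.urlretrieve(url, zip_path)
    return zip_path


def extract_checkpoints(zip_path="ckpts_ICT.zip", dest="."):
    """Hypothetical helper mirroring `unzip`: unpack into dest, return member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

Run from inside the `ICT` directory, `extract_checkpoints(download_checkpoints())` should leave the same files in place as the shell commands.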

If Dropbox doesn't work for you, try Baidu Drive (verification code: 6g4f).

The checkpoints can also be downloaded from OneDrive.

Some tips:

:star2: Pipeline

<img src='imgs/Pipeline.png'/>

Why transformer?

Compared with traditional CNN-based methods, transformers are better at capturing global shape and geometry.

<img src='imgs/structure.png'/>

:rocket: Training

1) Transformer

cd Transformer
python main.py --name [exp_name] --ckpt_path [save_path] \
               --data_path [training_image_path] \
               --validation_path [validation_image_path] \
               --mask_path [mask_path] \
               --BERT --batch_size 64 --train_epoch 100 \
               --nodes 1 --gpus 8 --node_rank 0 \
               --n_layer [transformer_layer #] --n_embd [embedding_dimension] \
               --n_head [head #] --ImageNet --GELU_2 \
               --image_size [input_resolution]
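The `--nodes`, `--gpus`, and `--node_rank` flags describe the multi-GPU layout. As a minimal sketch, assuming `main.py` follows the usual PyTorch distributed convention (this README does not confirm it), the total number of processes and the global rank of each process would be derived as below. `world_size` and `global_rank` are hypothetical helper names used only for illustration.

```python
def world_size(nodes, gpus):
    """Total number of training processes: one per GPU on every node."""
    return nodes * gpus


def global_rank(node_rank, gpus, local_gpu):
    """Global rank of the process driving GPU `local_gpu` on node `node_rank`.

    Assumes ranks are numbered node by node, which is the common convention.
    """
    if not 0 <= local_gpu < gpus:
        raise ValueError("local_gpu must be in [0, gpus)")
    return node_rank * gpus + local_gpu


# Layout from the command above: --nodes 1 --gpus 8 --node_rank 0
print(world_size(1, 8))                           # 8
print([global_rank(0, 8, g) for g in range(8)])   # [0, 1, 2, 3, 4, 5, 6, 7]
```

Under this assumption, training on a second machine would use `--nodes 2` on both machines, with `--node_rank 0` on the first and `--node_rank 1` on the second.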

Notes on the transformer:

2) Guided Upsampling

cd Guided_Upsample
python train.py --model 2 --checkpoints [save_path] \
                --config_file ./config_list/config_template.yml \
                --Generator 4 --use_degradation_2

Notes on guided upsampling:

:zap: Inference

We provide a convenient script for inference.

python run.py --input_image [test_image_folder] \
              --input_mask [test_mask_folder] \
              --sample_num 1  --save_place [save_path] \
              --ImageNet --visualize_all
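To call the script from Python, for instance to loop over several test sets, the command above can be assembled with `subprocess`. This is a sketch only: `build_inference_command` and `run_inference` are hypothetical wrappers, and they pass through only the flags shown above without interpreting them.

```python
import subprocess
import sys


def build_inference_command(image_dir, mask_dir, save_dir,
                            sample_num=1, imagenet=True, visualize_all=True):
    """Assemble the run.py command line from the flags documented above."""
    cmd = [
        sys.executable, "run.py",
        "--input_image", image_dir,
        "--input_mask", mask_dir,
        "--sample_num", str(sample_num),
        "--save_place", save_dir,
    ]
    if imagenet:
        cmd.append("--ImageNet")
    if visualize_all:
        cmd.append("--visualize_all")
    return cmd


def run_inference(image_dir, mask_dir, save_dir, **options):
    """Run the script and raise CalledProcessError if it exits with an error.

    Assumes the current working directory contains run.py.
    """
    subprocess.run(build_inference_command(image_dir, mask_dir, save_dir, **options),
                   check=True)
```

Passing the arguments as a list avoids shell quoting problems when folder names contain spaces.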

Notes on inference:

More results

FFHQ <img src='imgs/FFHQ.png'/>

Places2 <img src='imgs/Places2.png'/>

ImageNet <img src='imgs/ImageNet.png'/>

:hourglass_flowing_sand: To Do

:notebook_with_decorative_cover: Citation

If you find our work useful for your research, please consider citing the following paper :)

@article{wan2021high,
  title={High-Fidelity Pluralistic Image Completion with Transformers},
  author={Wan, Ziyu and Zhang, Jingbo and Chen, Dongdong and Liao, Jing},
  journal={arXiv preprint arXiv:2103.14031},
  year={2021}
}

A real-world application of image inpainting is also ready! Try, and cite, our old photo restoration algorithm here.

@inproceedings{wan2020bringing,
  title={Bringing Old Photos Back to Life},
  author={Wan, Ziyu and Zhang, Bo and Chen, Dongdong and Zhang, Pan and Chen, Dong and Liao, Jing and Wen, Fang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={2747--2757},
  year={2020}
}

:bulb: Acknowledgments

This repo is built upon minGPT and Edge-Connect. We also thank OpenAI for providing the cluster centers.

:incoming_envelope: Contact

This repo is currently maintained by Ziyu Wan (@Raywzy) and is for academic research use only. Discussions and questions are welcome via raywzy@gmail.com.