
Spectral Hint GAN

This repo hosts the official implementation of:

Xingqian Xu, Shant Navasardyan, Vahram Tadevosyan, Andranik Sargsyan, Yadong Mu and Humphrey Shi, Image Completion with Heterogeneously Filtered Spectral Hints, Paper arXiv Link.

News

[2022.11.12]: Evaluation code and pretrained model released.
[2022.11.07]: Our paper was accepted to WACV 2023.
[2022.11.06]: Repo initiated.

Introduction

Spectral Hint GAN (SH-GAN) is a high-performing inpainting network powered by CoModGAN and novel spectral processing techniques. SH-GAN reaches state-of-the-art results on FFHQ and Places2 with free-form masks.

Network and Algorithm

The overall structure of SH-GAN is shown in the following figure:

The structure of our Spectral Hint Unit is shown in the following figure:

Heterogeneous Filtering Explanation:

A 1x1 convolution in the Fourier domain applies a uniform (homogeneous) transform from one spectral space to another.
ReLU in the Fourier domain acts like a value-dependent band-pass filter that zeroes out some frequency values.
We promote heterogeneous transforms in spectral space, in which the transformation applied to a frequency value depends on its frequency band.
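
As a rough illustration of the distinction above, the NumPy sketch below contrasts a homogeneous transform (one channel-mixing matrix shared by every frequency) with a heterogeneous one (a gain that depends on the radial frequency band). The function names, the band edges, and the per-band gains are hypothetical choices made for illustration; this is not the SH-GAN layer itself.

```python
import numpy as np


def radial_frequency(height, width):
    """Normalized radial frequency of every FFT bin (0 at DC)."""
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2)


def homogeneous_transform(x, weight):
    """Mix channels with the SAME matrix at every frequency (a 1x1 conv in Fourier space).

    x: real array of shape (C_in, H, W); weight: real array of shape (C_out, C_in).
    """
    spectrum = np.fft.fft2(x, axes=(-2, -1))
    mixed = np.einsum("oc,chw->ohw", weight, spectrum)
    return np.fft.ifft2(mixed, axes=(-2, -1)).real


def heterogeneous_transform(x, band_edges, band_gains):
    """Scale each frequency by the gain of the radial band it falls in.

    band_edges: increasing radii that separate the bands (hypothetical values).
    band_gains: one gain per band, so len(band_gains) == len(band_edges) + 1.
    """
    height, width = x.shape[-2:]
    band_index = np.digitize(radial_frequency(height, width), band_edges)
    gain_map = np.asarray(band_gains, dtype=float)[band_index]
    spectrum = np.fft.fft2(x, axes=(-2, -1))
    return np.fft.ifft2(spectrum * gain_map, axes=(-2, -1)).real
```

In this simplified complex-valued form, the homogeneous transform is equivalent to applying the same channel mixing directly in the spatial domain, because the FFT is linear. The band-dependent version has no such spatial 1x1 equivalent, which is the property the heterogeneous design relies on.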

Gaussian Split Algorithm Explanation:

Gaussian Split is a spectral-space downsampling method that is well suited to deep learning architectures. A quick intuition is that it resembles a wavelet transform: it passes the information in each frequency band to the corresponding resolution.
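
The intuition above can be sketched loosely in NumPy. The code below is not the paper's Gaussian Split algorithm; it is a simplified stand-in that assumes a single Gaussian low-pass mask with a hypothetical width `sigma`, sends the low-frequency part to a half-resolution image by cropping the centered spectrum, and keeps the residual high-frequency part at full resolution.

```python
import numpy as np


def centered_gaussian(height, width, sigma):
    """Gaussian low-pass mask over a centered (fftshift-ed) spectrum; equals 1 at DC."""
    fy = np.fft.fftshift(np.fft.fftfreq(height))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(width))[None, :]
    return np.exp(-(fy ** 2 + fx ** 2) / (2.0 * sigma ** 2))


def gaussian_split_sketch(image, sigma=0.15):
    """Split a 2D image into a half-resolution low band and a full-resolution high band."""
    height, width = image.shape
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    mask = centered_gaussian(height, width, sigma)
    low, high = spectrum * mask, spectrum * (1.0 - mask)

    # Crop the central block so that DC stays at the center of the smaller spectrum.
    new_h, new_w = height // 2, width // 2
    top = height // 2 - new_h // 2
    left = width // 2 - new_w // 2
    low_cropped = low[top:top + new_h, left:left + new_w]

    # Rescale so the mean intensity survives the change in resolution.
    scale = (new_h * new_w) / (height * width)
    low_res = np.fft.ifft2(np.fft.ifftshift(low_cropped)).real * scale
    high_res = np.fft.ifft2(np.fft.ifftshift(high)).real
    return low_res, high_res
```

With this split, the half-resolution output carries the mean and coarse structure of the input, while the full-resolution output carries only detail and has zero mean.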

Data

We use FFHQ and Places2 as our main datasets. Download them from the following official links: FFHQ, Places2

Directory of FFHQ data for our code:

├── data
│   └── ffhq
│       ├── ffhq256x256.zip
│       └── ffhq512x512.zip

Directory of Places2 data for our code:

Download data_challenge.zip from the Places2 official website and decompress it to /data/Places2.
Do the same for val_large.zip.

├── data
│   └── Places2
│       ├── data_challenge
│       │   ...
│       └── val_large
│           ...
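
Before running anything, it can help to confirm that the layout matches the trees above. The following is a minimal sketch, assuming the `data` folder sits directly under the directory you pass in; the helper name `missing_data_paths` is hypothetical and not part of this repo.

```python
from pathlib import Path

# Paths copied from the directory trees above, relative to the repository root (assumed).
EXPECTED_PATHS = [
    "data/ffhq/ffhq256x256.zip",
    "data/ffhq/ffhq512x512.zip",
    "data/Places2/data_challenge",
    "data/Places2/val_large",
]


def missing_data_paths(repo_root="."):
    """Return the expected dataset paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

Calling `missing_data_paths(".")` from the repository root returns an empty list when every file and folder is in place.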

Setup

conda create -n shgan python=3.8
conda activate shgan
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install -r requirement.txt
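
To check that the environment picked up the pinned versions, a small sketch like the one below can be run inside the `shgan` environment. It assumes the conda packages register under the distribution names `torch`, `torchvision` and `torchaudio`; the helper name `version_mismatches` is hypothetical.

```python
from importlib import metadata

# Versions pinned by the conda command above.
PINNED = {"torch": "1.8.0", "torchvision": "0.9.0", "torchaudio": "0.8.0"}


def version_mismatches(pinned=PINNED):
    """Return {package: (expected, found)} for packages that are missing or differ."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Some builds carry a local suffix such as "+cu111"; compare the public part only.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

An empty result from `version_mismatches()` means all three packages match the versions listed in the setup commands.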

Results and pretrained models

Model	DIM	DATA	FID	LPIPS	PSNR	SSIM	Download
CoModGAN	256	FFHQ	4.7755	0.2568	16.24	0.5913
SH-GAN	256	FFHQ	4.3459	0.2542	16.37	0.5911	link
CoModGAN	512	FFHQ	3.6996	0.2469	18.46	0.6956
SH-GAN	512	FFHQ	3.4134	0.2447	18.43	0.6936	link
CoModGAN	256	Places2	9.3621	0.3990	14.50	0.4923
SH-GAN	256	Places2	7.5036	0.3940	14.58	0.4958	link
CoModGAN	512	Places2	7.9735	0.3420	16.00	0.5953
SH-GAN	512	Places2	7.0277	0.3386	16.03	0.5973	link
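
To put the FID columns in perspective (lower is better), the relative reduction of SH-GAN over CoModGAN can be computed directly from the table. The snippet below is only arithmetic on the numbers listed above; the function name is hypothetical.

```python
# (resolution, dataset) -> (CoModGAN FID, SH-GAN FID), copied from the table above.
FID = {
    (256, "FFHQ"): (4.7755, 4.3459),
    (512, "FFHQ"): (3.6996, 3.4134),
    (256, "Places2"): (9.3621, 7.5036),
    (512, "Places2"): (7.9735, 7.0277),
}


def relative_fid_reduction(baseline, ours):
    """Fraction by which `ours` lowers the baseline FID."""
    return (baseline - ours) / baseline


for (dim, data), (comodgan, shgan) in FID.items():
    reduction = 100.0 * relative_fid_reduction(comodgan, shgan)
    print(f"{data} {dim}: {reduction:.1f}% lower FID")
```

Across the four settings the reduction ranges from roughly 8% to 20%, with the largest gain on Places2 at 256 resolution.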

Evaluation

Here are the one-line shell commands to evaluate SH-GAN on FFHQ 256/512 and Places2 256/512.

python main.py --experiment shgan_ffhq256_eval --gpu 0 1 2 3 4 5 6 7 --eval 99999
python main.py --experiment shgan_ffhq512_eval --gpu 0 1 2 3 4 5 6 7 --eval 99999
python main.py --experiment shgan_places256_eval --gpu 0 1 2 3 4 5 6 7 --eval 99999
python main.py --experiment shgan_places512_eval --gpu 0 1 2 3 4 5 6 7 --eval 99999
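
If you prefer to drive the four evaluations from a script, the commands above can be assembled programmatically. This is a sketch using only the flags shown above; the helper name and its parameters are hypothetical, and whether `main.py` runs correctly with a different GPU list is an assumption that has not been checked here.

```python
import shlex

# Experiment names taken from the commands above.
EXPERIMENTS = [
    "shgan_ffhq256_eval",
    "shgan_ffhq512_eval",
    "shgan_places256_eval",
    "shgan_places512_eval",
]


def build_eval_command(experiment, gpus=range(8), eval_value=99999):
    """Build the argument list for one evaluation run."""
    return [
        "python", "main.py",
        "--experiment", experiment,
        "--gpu", *map(str, gpus),
        "--eval", str(eval_value),
    ]


# Print the commands instead of running them, so they can be inspected first.
for name in EXPERIMENTS:
    print(shlex.join(build_eval_command(name)))
```

Each list can be passed to `subprocess.run(command, check=True)` from the repository root to launch the run.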

You also need to:

Download the data and arrange it in the directories described in the Data section.
Create ./pretrained and move all downloaded pretrained models into it.
Create ./log/shgan_ffhq/99999_eval and ./log/shgan_places2/99999_eval.
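
The directories listed above can be created in one step. The sketch below assumes they live directly under the repository root; the helper name `create_eval_dirs` is hypothetical.

```python
from pathlib import Path

# Directories named in the steps above, relative to the repository root (assumed).
REQUIRED_DIRS = [
    "pretrained",
    "log/shgan_ffhq/99999_eval",
    "log/shgan_places2/99999_eval",
]


def create_eval_dirs(repo_root="."):
    """Create the folders the evaluation expects; existing folders are left untouched."""
    created = []
    for rel in REQUIRED_DIRS:
        path = Path(repo_root) / rel
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Run `create_eval_dirs(".")` from the repository root, then move the downloaded pretrained models into `pretrained`.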

Some simple fixes for common issues:

The evaluation code caches .cache/****_real_feat.npy and later relies on it for FID calculation. If this file is corrupted, the reported numbers will be wrong. You can simply remove it and the code will automatically recompute it.
The final stage of FID computation relies on CPU resources, so it is normal for it to be slow. Please be patient.
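
Clearing the cached real-image features can be done by hand or with a short helper. The sketch below assumes the cache folder is `.cache` relative to the directory where `main.py` is run and that the cached files match `*_real_feat.npy`; the helper name is hypothetical.

```python
from pathlib import Path


def clear_real_feature_cache(cache_dir=".cache"):
    """Delete cached real-image feature files and return the names that were removed."""
    removed = []
    for path in sorted(Path(cache_dir).glob("*_real_feat.npy")):
        path.unlink()
        removed.append(path.name)
    return removed
```

Other files in the cache folder are left alone, and the next evaluation run recomputes the removed features.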

Training

Coming soon.

Citation

@inproceedings{xu2023image,
  title={Image Completion with Heterogeneously Filtered Spectral Hints},
  author={Xu, Xingqian and Navasardyan, Shant and Tadevosyan, Vahram and Sargsyan, Andranik and Mu, Yadong and Shi, Humphrey},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={4591--4601},
  year={2023}
}

Acknowledgement

Parts of this code reorganize or reimplement code from the following repositories: the official CoModGAN GitHub and the official StyleGAN2-ADA GitHub.