Latent Subspace Optimization
The official PyTorch implementation of our paper Where is My Spot? Few-shot Image Generation via Latent Subspace Optimization, CVPR 2023.
Where is My Spot? Few-shot Image Generation via Latent Subspace Optimization
Chenxi Zheng, Bangzhen Liu, Xuemiao Xu, Huaidong Zhang, and Shengfeng He
Environment
- Python 3.7
- PyTorch 1.7.1
Prepare the environment using either conda or pip.

Using conda:
```
cd envs
conda env create -f environment.yaml
cd ..
```

Using pip:
```
cd envs
conda create -n LSO python=3.7
conda activate LSO
pip install -r requirements.txt
cd ..
```
If the PyTorch installation fails or a custom PyTorch version is needed, please install PyTorch manually following the official guidance.
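For reference, the historical command for this exact version from the official previous-versions page looks like the following (CUDA 11.0 shown; pick the variant matching your CUDA setup on pytorch.org):
```
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html
```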
Getting started
Before training, prepare the pretrained checkpoints optimized on the seen categories.
- Dataset: Follow LoFGAN and download the datasets.
- Pretrained StyleGAN2: download the pretrained checkpoints.
- Inverted latent codes: download inverted latent codes.
- Weights for feature extractor: download the weights.
Note that the latent codes only contain the subset used for image generation, $\mathbb{S}_{gen}^{c}$ in Sec. 4.2. For example, the tensor in `flowers_unseen17_0-10_step1300.npy` has shape $[17 \times 10, 12, 512]$, which corresponds to the slice `[85:102, 0:10, :, :, :]` of the images in `flower_c8189_s128_data_rgb.npy`, whose tensor shape is $[102, 40, 128, 128, 3]$.
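As a quick sanity check of this correspondence (a minimal sketch, assuming both `.npy` files sit in the working directory):
```python
import numpy as np

# Latent codes: 17 unseen flower categories x 10 images each
ws = np.load("flowers_unseen17_0-10_step1300.npy")    # (170, 12, 512)
# Full image array: 102 categories x 40 images of 128x128 RGB
imgs = np.load("flower_c8189_s128_data_rgb.npy")      # (102, 40, 128, 128, 3)

subset = imgs[85:102, 0:10]                           # (17, 10, 128, 128, 3)
assert ws.shape[0] == subset.shape[0] * subset.shape[1]  # 17 * 10 == 170
```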
Unzip the files and set up the paths in `configs/default_configs.py`:
```python
DATA_PATH = <PATH_TO_DATASET_NPY>
CKPT_PATH = <PATH_TO_STYLEGAN2_CKPT>
WS_PATH = <PATH_TO_WS>
IDCKPT_PATH = <PATH_TO_IDWEIGHTS>
```
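For instance (illustrative values only; the two `.npy` names match the files mentioned above, while the checkpoint paths are placeholders):
```python
DATA_PATH = "./data/flower_c8189_s128_data_rgb.npy"   # dataset images
CKPT_PATH = "./ckpts/stylegan2_flowers.pkl"           # placeholder name
WS_PATH = "./ws/flowers_unseen17_0-10_step1300.npy"   # inverted latent codes
IDCKPT_PATH = "./ckpts/feature_extractor.pth"         # placeholder name
```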
Training
Multi-task for quantitative evaluation
```
python train_unseen.py \
    --outdir <output_dir> \
    --k_shot <k> \
    --dataset_name <dataset_name>
```
- The code does NOT support multi-GPU processing in the current version, but it allows separate runs for different categories: for instance, execute a run for categories $[0, 8)$ on GPU 0 and another for categories $[8, 17)$ on GPU 1 (see the example after this list). The indices can be specified with `--classes=<start_idx>-<end_idx>`.
- Results are saved in the directory `<output_dir>/<run_dir>/few-shot_samples/magnitude=<magnitude-intensity>`. For each category, we randomly generate 128 images for quantitative evaluation.
- Refer to the main definition in `train_unseen.py` for command examples.
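A hypothetical two-run split over the 17 unseen categories might look like this (the dataset name, output directory, and the use of CUDA_VISIBLE_DEVICES to select the GPU are illustrative assumptions; check `train_unseen.py` for the exact values it accepts):
```
# terminal 1 (GPU 0), categories [0, 8)
CUDA_VISIBLE_DEVICES=0 python train_unseen.py --outdir results --k_shot 3 --dataset_name flowers --classes=0-8
# terminal 2 (GPU 1), categories [8, 17)
CUDA_VISIBLE_DEVICES=1 python train_unseen.py --outdir results --k_shot 3 --dataset_name flowers --classes=8-17
```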
Single-task for detailed visualization
We also provide single-task optimization for visualization and detailed evaluation of the optimization.
```
python train_unseen.py \
    --outdir <output_dir> \
    --k_shot <k> \
    --single_task <cidx> <idx_1,...,idx_k> \
    --dataset_name <dataset_name>
```
- Specify the argument `--single_task` with the category number and the image indices (see the example below).
- To save the checkpoints, set the argument `--save_ckpt` to true.
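For example, a hypothetical 3-shot run on category 0 with images 1, 3, and 5 (the index values and dataset name are illustrative):
```
python train_unseen.py --outdir results --k_shot 3 --single_task 0 1,3,5 --dataset_name flowers --save_ckpt true
```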
Evaluation
(Optional) If the images were generated in separate runs, use `merger.py` to combine all of them:
```
python merger.py \
    --path <output_dir> \
    --idx <runidx_1>,...,<runidx_n>
```
- Refer to the main definition in `merger.py` for command examples.
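For instance, to merge the images from runs 0 and 1 under the same output directory (run indices are illustrative):
```
python merger.py --path results --idx 0,1
```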
Quantitatively evaluate the generated images:
```
python main_metric_calculate.py \
    --real_dir <real_directory> \
    --fake_dir <fake_directory> \
    --dataset_name <dataset_name>
```
- Refer to the main definition in `main_metric_calculate.py` for command examples.
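A hypothetical invocation (both directory paths are placeholders; see the script for what it expects):
```
python main_metric_calculate.py --real_dir data/flowers_real --fake_dir results/00000/few-shot_samples --dataset_name flowers
```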
The calculation of LPIPS is significantly accelerated by first extracting the features of each image; please refer to `metrics/lpips_fs/lpips_fs.py`.
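The idea, in a minimal sketch (a simplification using a plain VGG16 backbone, not the repo's actual LPIPS variant): run the feature extractor once per image and compute all pairwise distances from the cached features, instead of re-running the backbone for every pair.
```python
import torch
import torchvision.models as models

# Simplified illustration of the caching idea (NOT metrics/lpips_fs itself):
# N backbone passes instead of one per pair.
vgg = models.vgg16(pretrained=True).features.eval()

@torch.no_grad()
def extract_features(images):
    # images: (N, 3, H, W), ImageNet-normalized
    feats = vgg(images).flatten(1)                  # one pass per image
    return feats / feats.norm(dim=1, keepdim=True)  # unit-normalize

@torch.no_grad()
def pairwise_distances(images):
    feats = extract_features(images)    # cache features once
    return torch.cdist(feats, feats)    # (N, N) distances, no re-extraction
```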
Citation
If you use this code for your research, please cite our paper.
```bibtex
@inproceedings{zheng2023my,
  title={Where Is My Spot? Few-Shot Image Generation via Latent Subspace Optimization},
  author={Zheng, Chenxi and Liu, Bangzhen and Zhang, Huaidong and Xu, Xuemiao and He, Shengfeng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={3272--3281},
  year={2023}
}
```
Acknowledgement
This project builds upon and has been inspired by the following repositories:
- Training and evaluation code are based on StyleGAN2-ADA and LoFGAN.
- The code for calculating FID is based on pytorch-fid.
We would like to thank the entire open-source community for fostering an environment of collaboration and knowledge sharing.
License
This repository is under the MIT license.