CF-CLIP (Towards Counterfactual Image Manipulation via CLIP)
This repository is an official PyTorch implementation of the ACM MM 2022 paper "Towards Counterfactual Image Manipulation via CLIP".
Setup
The code relies on the official implementation of CLIP and the rosinality PyTorch implementation of StyleGAN2.
Requirements
For all the methods described in the paper, it is required to have:
- Anaconda
- CLIP
Specific requirements for each method are described in its section. To install CLIP please run the following commands:
conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm gdown
pip install git+https://github.com/openai/CLIP.git
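After installation, a quick check can confirm that the packages are importable. The following is a minimal sketch and is not part of the repository: the helper name `missing_packages` is hypothetical, and the module names in `REQUIRED` are assumed to match the packages installed by the commands above.

```python
import importlib.util

# Import names of the packages installed by the commands above (assumed).
REQUIRED = ["torch", "torchvision", "clip", "ftfy", "regex", "tqdm", "gdown"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks up each module without importing it, so it does not verify the CUDA setup or the installed versions.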
Pretrained Models
Please download the following pretrained models and place them in the ./pretrained folder.
StyleGAN
For AFHQ Dog and Cat, the TensorFlow pretrained model can be converted to a PyTorch model using convert_weight.py.
Face Recognition & VGG
Latent Codes
Inverted CelebA-HQ via e4e:
Random Sample
For cat and dog, we randomly sample w codes of shape 1×512 using GetCode.py, which uses the TensorFlow pretrained model ('.pkl'). In this case, the w_space option of the training script must be set to True.
Usage
Pretrained Models
We provide pretrained models for the different face, AFHQ Dog, and AFHQ Cat cases in our paper here. You may put them under the pretrained folder after downloading.
Training
- The main training script is mapper/scripts/train.py.
- Training arguments can be found in mapper/options/train_options.py.
- Intermediate training results are saved to opts.exp_dir. This includes checkpoints, train outputs, and test outputs. Additionally, if you have tensorboard installed, you can visualize the tensorboard logs in opts.exp_dir/logs.
- To resume training, provide --checkpoint_path.
- --description is where you provide the driving text.
Example for training a mapper for the green lipstick:
cd mapper
python scripts/train.py --exp_dir ../results/green_lipstick --description "green lipstick"
You may refer to train.sh for an example of training the AFHQ Dog/Cat cases.
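When launching several runs with different driving texts, it can help to assemble the training command programmatically. The following is a minimal sketch that uses only the flags mentioned in this section (--exp_dir, --description, --checkpoint_path); the helper name `build_train_command` is hypothetical and is not part of the repository.

```python
import shlex


def build_train_command(exp_dir, description, checkpoint_path=None):
    """Assemble the argument list for scripts/train.py (run from the mapper folder)."""
    cmd = ["python", "scripts/train.py",
           "--exp_dir", exp_dir,
           "--description", description]
    if checkpoint_path is not None:
        # Resume from an existing checkpoint.
        cmd += ["--checkpoint_path", checkpoint_path]
    return cmd


if __name__ == "__main__":
    cmd = build_train_command("../results/green_lipstick", "green lipstick")
    # shlex.join quotes the multi-word description so the line can be pasted into a shell.
    print(shlex.join(cmd))
    # To launch the run instead of printing it:
    # subprocess.run(cmd, cwd="mapper", check=True)
```

Passing the arguments as a list avoids shell quoting issues when the driving text contains spaces.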
Inference
- The main inference script is mapper/scripts/inference.py.
- Inference arguments can be found in mapper/options/test_options.py.
- Adding the flag --couple_outputs saves an image containing the input and output images side by side.
You may refer to test.sh for an example.
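To make the side-by-side output concrete, the sketch below shows the idea on toy images stored as nested lists of pixel values. It is an illustration only, under the assumption that coupling means horizontal concatenation of images with equal height; the function name `couple_side_by_side` is hypothetical and this is not the repository's implementation.

```python
def couple_side_by_side(left, right):
    """Concatenate two images (lists of pixel rows) horizontally."""
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    # Each output row is the left image's row followed by the right image's row.
    return [row_l + row_r for row_l, row_r in zip(left, right)]


if __name__ == "__main__":
    source = [[0, 0], [0, 0]]          # toy 2x2 "input" image
    edited = [[255, 255], [255, 255]]  # toy 2x2 "output" image
    coupled = couple_side_by_side(source, edited)
    # The height is unchanged and the width is the sum of both widths.
    print(len(coupled), len(coupled[0]))
```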
Citation
If you find CF-CLIP useful or inspiring, please consider citing:
@inproceedings{yu2022-CFCLIP,
title = {Towards Counterfactual Image Manipulation via CLIP},
author = {Yu, Yingchen and Zhan, Fangneng and Wu, Rongliang and Zhang, Jiahui and Lu, Shijian and Cui, Miaomiao and Xie, Xuansong and Hua, Xian-Sheng and Miao, Chunyan},
booktitle = {Proceedings of the 30th ACM International Conference on Multimedia},
year = {2022}
}
Acknowledgments
This code borrows heavily from StyleCLIP, StyleGAN-NADA and InfoNCE. We appreciate the authors for sharing their code.