ManiGAN

PyTorch implementation of ManiGAN: Text-Guided Image Manipulation. The goal is to semantically edit parts of an image according to a given text description while preserving the contents that the text does not refer to.

Overview

<img src="archi.jpg" width="940px" height="295px"/>

ManiGAN: Text-Guided Image Manipulation.
Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr.<br> University of Oxford <br> CVPR 2020 <br>

Data

  1. Download the preprocessed metadata for the bird and COCO datasets, and save both into data/
  2. Download the bird dataset and extract the images to data/birds/
  3. Download the COCO dataset and extract the images to data/coco/
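
As a quick sanity check on the steps above, the sketch below reports which of the expected directories are still missing. It only looks at the three paths named in this section (data/, data/birds/, data/coco/); the helper name `missing_data_dirs` is hypothetical and not part of this repository, and the sketch does not inspect the metadata files, whose names are not listed here.

```python
from pathlib import Path

# Image directories named in the Data section of this README.
EXPECTED_DIRS = ("birds", "coco")


def missing_data_dirs(root="data"):
    """Return the expected directories that do not exist under `root`.

    Hypothetical helper: it checks only for directory presence, not
    for the contents of the metadata or image folders.
    """
    root = Path(root)
    candidates = [root] + [root / name for name in EXPECTED_DIRS]
    return [str(path) for path in candidates if not path.is_dir()]
```

Calling `missing_data_dirs()` from the repository root returns an empty list once all three directories are in place.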

Training

All code was developed and tested on CentOS 7 with Python 3.7 (Anaconda) and PyTorch 1.1.
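
To compare a local setup against the tested environment, a minimal sketch such as the following prints the Python and PyTorch versions. The function name `environment_report` is hypothetical; the sketch only reports versions and does not enforce the ones listed above.

```python
import importlib
import sys


def environment_report():
    """Return the running Python version and the installed torch version.

    The torch entry is None when PyTorch cannot be imported.
    """
    try:
        torch_version = importlib.import_module("torch").__version__
    except ImportError:
        torch_version = None
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": torch_version,
    }
```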

The DAMSM model includes a text encoder and an image encoder.

python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
python pretrain_DAMSM.py --cfg cfg/DAMSM/coco.yml --gpu 1

ManiGAN, main module

python main.py --cfg cfg/train_bird.yml --gpu 2
python main.py --cfg cfg/train_coco.yml --gpu 3

The *.yml files contain the configuration for training and testing.

ManiGAN, detail correction module (DCM)

Save the trained main module to models/ before training the DCM.

python DCM.py --cfg cfg/train_bird.yml --gpu 2
python DCM.py --cfg cfg/train_coco.yml --gpu 3

Pretrained DAMSM Model

Pretrained ManiGAN Model

Testing

python main.py --cfg cfg/eval_bird.yml --gpu 4
python main.py --cfg cfg/eval_coco.yml --gpu 5
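
The commands in the Training and Testing sections follow one pattern per dataset: DAMSM pretraining, main module, DCM, then testing. The sketch below assembles those four commands for a given dataset, using only the scripts, config paths, and flags shown in this README. The names `build_commands` and `run_pipeline` are hypothetical, and the sketch does not copy the trained main module to models/ between stages, so it defaults to a dry run that only prints the commands.

```python
import subprocess

# (script, config template) pairs in the order used by this README.
STAGES = (
    ("pretrain_DAMSM.py", "cfg/DAMSM/{dataset}.yml"),  # text and image encoders
    ("main.py", "cfg/train_{dataset}.yml"),            # main module
    ("DCM.py", "cfg/train_{dataset}.yml"),             # detail correction module
    ("main.py", "cfg/eval_{dataset}.yml"),             # testing
)


def build_commands(dataset, gpu=0):
    """Return the four command lines for `dataset` ("bird" or "coco")."""
    if dataset not in ("bird", "coco"):
        raise ValueError("dataset must be 'bird' or 'coco'")
    return [
        ["python", script, "--cfg", cfg.format(dataset=dataset), "--gpu", str(gpu)]
        for script, cfg in STAGES
    ]


def run_pipeline(dataset, gpu=0, dry_run=True):
    """Print each command; execute it only when dry_run is False.

    Assumption: the trained main module still has to be saved to
    models/ by hand before the DCM stage, which this sketch does not do.
    """
    for command in build_commands(dataset, gpu):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Running `run_pipeline("bird", gpu=2)` prints the four bird commands without executing them, which makes it easy to check the config paths before starting a long training run.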

Evaluation

Code Structure

Citation

If you find this useful for your research, please cite the following.

@inproceedings{li2020manigan,
  title={Manigan: Text-guided image manipulation},
  author={Li, Bowen and Qi, Xiaojuan and Lukasiewicz, Thomas and Torr, Philip HS},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7880--7889},
  year={2020}
}

Acknowledgements

This code borrows heavily from the ControlGAN repository. Many thanks.