Awesome
Modality Disentangled Discriminator for Text-to-Image Synthesis
Introduction
This project page provides pytorch code that implements the paper: "Modality Disentangled Discriminator for Text-to-Image Synthesis".
How to use
Python
- Python2.7
- Pytorch1.0+
- tensorflow (
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.12.0-cp27-none-linux_x86_64.whl
) pip install easydict pathlib
conda install requests nltk pandas scikit-image pyyaml
Data
-
Download metadata for birds coco and save them to
data/
python google_drive.py 1O_LtUP9sch09QH3s_EBAgLEctBQ5JBSJ ./data/bird.zip
python google_drive.py 1rSnbIGNDGZeHlsUlLdahj0RJ9oo6lgH9 ./data/coco.zip
-
Download the birds image data. Extract them to
data/birds/
cd data/birds
wget http://www.vision.caltech.edu/visipedia-data/CUB-200-2011/CUB_200_2011.tgz
tar -xvzf CUB_200_2011.tgz
-
Download coco dataset and extract the images to
data/coco/
cd data/coco
wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip
unzip train2014.zip
unzip val2014.zip
mv train2014 images
cp val2014/* images
Pretrained Models
- DAMSM for bird. Download and save it to
DAMSMencoders/
python google_drive.py 1GNUKjVeyWYBJ8hEU-yrfYQpDOkxEyP3V DAMSMencoders/bird.zip
- DAMSM for coco. Download and save it to
DAMSMencoders/
python google_drive.py 1zIrXCE9F6yfbEJIbNP5-YrEe2pZcPSGJ DAMSMencoders/coco.zip
- DM-GAN-MDD for bird. Download and save it to
models
- DM-GAN-MDD for coco. Download and save it to
models
- IS for bird
python google_drive.py 0B3y_msrWZaXLMzNMNWhWdW0zVWs eval/IS/inception_finetuned_models.zip
- FID for bird
python google_drive.py 1747il5vnY2zNkmQ1x_8hySx537ZAJEtj eval/FID/bird_val.npz
- FID for coco
python google_drive.py 10NYi4XU3_bLjPEAg5KQal-l8A_d8lnL5 eval/FID/coco_val.npz
Training
- go into
code/
folder - bird:
python main.py --cfg cfg/bird_DMGANMDD.yml --gpu 0
- coco:
python main.py --cfg cfg/coco_DMGANMDD.yml --gpu 0
Validation
- Images generation:
- go into
code/
folder python main.py --cfg cfg/eval_bird_DMGANMDD.yml --gpu 0
python main.py --cfg cfg/eval_coco_DMGANMDD.yml --gpu 0
- go into
- Inception score (IS for bird, IS for coco):
cd DM-GAN-MDD/eval/IS/bird && CUDA_VISIBLE_DEVICES=0 python inception_score_bird.py --image_folder ../../../models/netG_DMGANMDD_bird
cd DM-GAN-MDD/eval/IS/coco && CUDA_VISIBLE_DEVICES=0 python inception_score_coco.py ../../../models/netG_DMGANMDD_coco
- FID:
cd DM-GAN-MDD/eval/FID && python fid_score.py --gpu 0 --path1 bird_val.npz --path2 ../../models/netG_DMGANMDD_bird
cd DM-GAN-MDD/eval/FID && python fid_score.py --gpu 0 --path1 coco_val.npz --path2 ../../models/netG_DMGANMDD_coco
Performance
As DM-GAN, we use the Pytorch implementation to measure FID score.
Model | R-precision↑ | IS↑ | Pytorch FID↓ |
---|---|---|---|
bird_AttnGAN (paper) | 67.82% ± 4.43% | 4.36 ± 0.03 | 23.98 |
bird_DMGAN (paper) | 72.31% ± 0.91% | 4.75 ± 0.07 | 16.09 |
bird_DMGAN_MDD | 79.73% ± 0.68% | 4.86 ± 0.06 | 15.76 |
coco_AttnGAN (paper) | 85.47% ± 3.69% | 25.89 ± 0.47 | 35.49 |
coco_DMGAN (paper) | 88.56% ± 0.28% | 30.49 ± 0.57 | 32.64 |
coco_DMGAN_MDD | 94.37% ± 0.36% | 34.46 ± 0.72 | 24.30 |
License
This code is released under the MIT License (refer to the LICENSE file for details).