GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis (CVPR 2023)

A high-quality, fast, and efficient text-to-image model

Official PyTorch implementation of our paper GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis by Ming Tao, Bing-Kun Bao, Hao Tang, Changsheng Xu.

Generated Images <img src="results.jpg"/>

Requirements

Python 3.9
PyTorch 1.9
At least one 24 GB 3090 GPU (for training)
CPU only (for sampling)

GALIP is a small, fast generative model that can generate several images per second, even on a CPU.
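Before installing, it can help to confirm that the interpreter and PyTorch are in place. The snippet below is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, and it treats the Python version listed above as a minimum, which is an assumption.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 9)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Assumption: the listed Python version is treated as a minimum.
    if sys.version_info[:2] < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ expected, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    # find_spec only looks the package up; it does not import it.
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch (package 'torch') is not installed")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks OK"]:
        print(line)
```

The check does not verify the PyTorch version or the GPU; it only reports whether the basics are present.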

Installation

Clone this repo.

git clone https://github.com/tobran/GALIP
pip install -r requirements.txt

Install CLIP

Preparation (Same as DF-GAN)

Datasets

Download the preprocessed metadata for birds and coco, and extract them to data/
Download the birds image data. Extract them to data/birds/
Download coco2014 dataset and extract the images to data/coco/images/
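Once the archives are extracted, a quick check of the directory layout can catch a misplaced folder before training starts. This is a minimal sketch: the helper `missing_dataset_dirs` is hypothetical, and it assumes the `data/` paths in the steps above are relative to the repository root.

```python
from pathlib import Path

# Directories named in the dataset steps above.
# Assumption: they are relative to the repository root.
EXPECTED_DIRS = ("data", "data/birds", "data/coco/images")


def missing_dataset_dirs(repo_root="."):
    """Return the expected dataset directories that do not exist yet."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    print("missing:", missing if missing else "none")
```

The check only looks for the directories themselves; it does not validate the files inside them.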

Training

cd GALIP/code/

Train the GALIP model

For bird dataset: bash scripts/train.sh ./cfg/bird.yml
For coco dataset: bash scripts/train.sh ./cfg/coco.yml

Resume training process

If your training process is interrupted unexpectedly, set state_epoch, log_dir, and pretrained_model_path in train.sh to resume training.

TensorBoard

Our code supports automated FID evaluation during training; the results are stored in TensorBoard files under ./logs. You can change the test interval by changing test_interval in the YAML file.

For bird dataset: tensorboard --logdir=./code/logs/bird/train --port 8166
For coco dataset: tensorboard --logdir=./code/logs/coco/train --port 8177
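If you prefer to change test_interval from a script rather than by hand, a text-level edit is enough. The sketch below is an assumption-laden illustration: the helper `set_test_interval` is hypothetical, and it assumes the key is written on a single line as `test_interval: <value>` in the config file. It uses only the standard library, so comments and ordering in the file are left untouched.

```python
import re


def set_test_interval(yaml_text, value):
    """Return yaml_text with the first test_interval value replaced by `value`."""
    # Assumption: the key sits on one line as "test_interval: <value>".
    pattern = re.compile(r"^([ \t]*test_interval[ \t]*:[ \t]*)\S+", flags=re.MULTILINE)
    # A function replacement avoids ambiguity between group references and digits.
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}", yaml_text, count=1
    )
    if count == 0:
        raise KeyError("test_interval not found in the given text")
    return new_text
```

To apply it, read the config file into a string, pass it through the function, and write the result back.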

Evaluation

Download Pretrained Model

GALIP for COCO. Download and save it to ./code/saved_models/pretrained/
GALIP for CC12M. Download and save it to ./code/saved_models/pretrained/

Evaluate GALIP models

cd GALIP/code/

Set pretrained_model in test.sh.

For bird dataset: bash scripts/test.sh ./cfg/bird.yml
For COCO dataset: bash scripts/test.sh ./cfg/coco.yml
For CC12M (zero-shot on COCO) dataset: bash scripts/test.sh ./cfg/coco.yml

Performance

The released model achieves better COCO results than the paper version (lower FID, higher CS); the CC12M zero-shot FID is unchanged.

Model	COCO-FID↓	COCO-CS↑	CC12M-ZFID↓
GALIP(paper)	5.85	0.3338	12.54
GALIP(released)	5.01	0.3379	12.54
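To put the gap between the two rows in perspective, the differences can be worked out directly from the table. The snippet below is plain arithmetic on the reported numbers; the variable names are illustrative only.

```python
# Scores copied from the table above.
paper = {"coco_fid": 5.85, "coco_cs": 0.3338}
released = {"coco_fid": 5.01, "coco_cs": 0.3379}

# FID is lower-is-better, CS is higher-is-better.
fid_drop = paper["coco_fid"] - released["coco_fid"]
fid_drop_pct = 100 * fid_drop / paper["coco_fid"]
cs_gain = released["coco_cs"] - paper["coco_cs"]

print(f"COCO FID: {fid_drop:.2f} lower ({fid_drop_pct:.1f}% relative)")
print(f"COCO CS:  {cs_gain:.4f} higher")
# Prints:
# COCO FID: 0.84 lower (14.4% relative)
# COCO CS:  0.0041 higher
```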

Sampling

Synthesize images from your text descriptions

Use sample.ipynb to sample images.

Citing GALIP

If you find GALIP useful in your research, please consider citing our paper:


@inproceedings{tao2023galip,
  title={GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis},
  author={Tao, Ming and Bao, Bing-Kun and Tang, Hao and Xu, Changsheng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14214--14223},
  year={2023}
}

The code is released for academic research use only. For commercial use, please contact Ming Tao (陶明) (mingtao2000@126.com).

Reference

DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis [code]