# StylerDALLE

Official PyTorch implementation of the ICCV 2023 paper *StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model*.
## Updates

- **17 Jul 2023:** StylerDALLE is accepted to ICCV 2023.
- **16 Aug 2023:** StylerDALLE is also accepted for presentation at the MMFM Workshop @ ICCV. See you in Paris!
- **13 Sep 2023:** Training code for StylerDALLE-1 and StylerDALLE-Ru released.
## Setup

### Environment

```
conda env create -f environment.yml
conda activate stylerdalle
```
## Usage

### Data Pre-Processing

Before training, we preprocess the COCO dataset: the images are encoded into discrete tokens with a pretrained vector-quantized tokenizer. First, download the images (train2014, val2014) and the annotations.

Preprocess for StylerDALLE-1 with the officially released dVAE of DALL-E:

```
python prep/prep_coco.py
```

Preprocess for StylerDALLE-Ru with the VQGAN of Ru-DALLE:

```
python prep/prep_coco_ru.py
```
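Conceptually, a vector-quantized tokenizer maps each image-patch feature to the index of its nearest codebook vector. The following is a minimal illustrative sketch in pure Python — not the repo's actual encoder, which is DALL-E's dVAE or Ru-DALLE's VQGAN; `quantize`, the codebook, and the toy features are all hypothetical:

```python
# Illustrative sketch only: map each feature vector to the index of the
# closest codebook entry (squared Euclidean distance). The real tokenizers
# (dVAE / VQGAN) do this with learned codebooks over CNN feature maps.

def quantize(features, codebook):
    """Return, for each feature, the index of its nearest codebook vector."""
    tokens = []
    for f in features:
        dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in codebook]
        tokens.append(dists.index(min(dists)))
    return tokens

codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # toy 3-entry codebook
features = [(0.1, -0.1), (0.9, 0.2), (0.2, 0.8)]  # toy patch features
print(quantize(features, codebook))  # -> [0, 1, 2]
```

The resulting integer tokens are what the preprocessing scripts cache to disk, so training never has to re-run the tokenizer's encoder.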
In addition, for reinforcement learning, we prepare the caption data files here. They are derived from the original COCO annotations but contain only the successfully preprocessed images.
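The filtering step behind those caption files can be sketched as follows — a hedged illustration, not the repo's actual code (`filter_captions` and the toy records are hypothetical): a caption entry is kept only if its image id survived preprocessing.

```python
# Illustrative sketch: keep only COCO caption entries whose image was
# successfully preprocessed (i.e., whose tokens were cached).

def filter_captions(annotations, processed_ids):
    """Keep caption entries whose image_id is in the preprocessed set."""
    return [a for a in annotations if a["image_id"] in processed_ids]

annotations = [
    {"image_id": 1, "caption": "a dog on a beach"},
    {"image_id": 2, "caption": "a red bus"},
]
processed_ids = {1}  # toy: only image 1 was preprocessed
print(filter_captions(annotations, processed_ids))
```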
### Train StylerDALLE-1

- Self-supervised pre-training:

  ```
  python train.py
  ```

- Reinforcement learning:

  ```
  python train_rl.py --styl 'a Van Gogh style oil painting'
  ```
### Train StylerDALLE-Ru

- Self-supervised pre-training:

  ```
  python train_ru.py
  ```

- Reinforcement learning:

  ```
  python train_ru_rl.py --styl 'a Van Gogh style oil painting'
  ```
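The `--styl` prompt drives the reinforcement-learning reward, which scores how well the stylized output matches the style text in a shared vision–language embedding space (a CLIP-style image–text similarity). A minimal sketch of such a similarity reward — plain cosine similarity; `cosine_reward` and the toy embeddings are illustrative assumptions, not the repo's code:

```python
# Illustrative sketch: reward = cosine similarity between a stylized-image
# embedding and the style-prompt text embedding. In practice both embeddings
# would come from a pretrained vision-language model such as CLIP.
import math

def cosine_reward(img_emb, txt_emb):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(img_emb, txt_emb))
    norm = (math.sqrt(sum(a * a for a in img_emb))
            * math.sqrt(sum(b * b for b in txt_emb)))
    return dot / norm

print(round(cosine_reward([1.0, 0.0], [1.0, 1.0]), 4))  # -> 0.7071
```

A higher reward means the generated stylization is closer to the text description, so the policy is pushed toward outputs that match the prompt.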
## Reference

```
@InProceedings{Xu2023StylerDALLE,
    author    = {Xu, Zipeng and Sangineto, Enver and Sebe, Nicu},
    title     = {StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {7601-7611}
}
```