[ECCV'22] Language-Driven Artistic Style Transfer

A PyTorch implementation of LDAST

Paper | Project | Slide | Video

<img src='_imgs/ldast.png' width='75%' />

Overview

LDAST is an implementation of <br> "Language-Driven Artistic Style Transfer" <br> Tsu-Jui Fu, Xin Eric Wang, and William Yang Wang <br> in European Conference on Computer Vision (<b>ECCV</b>) 2022 <br>

<img src='_imgs/clva.png' width='60%' />

Language visual artist (LVA) extracts content structures from the content image <I>C</I> and visual patterns from the style instruction <I>X</I> to perform LDAST. LVA adopts a patch-wise style discriminator <I>D</I> to connect the extracted visual semantics to patches of the paired style image (<I>P<sub>S</sub></I>). Contrastive reasoning (CR) compares the contrastive pairs <I>C<sub>1</sub>-X<sub>1</sub></I>, <I>C<sub>2</sub>-X<sub>1</sub></I>, and <I>C<sub>2</sub>-X<sub>2</sub></I> of content image and style instruction.
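
To make the pairing concrete, the following minimal sketch enumerates the three contrastive pairs named above and reports which of them share a content image or a style instruction. It is illustrative only: the `Pair` type and the `shared` helper are hypothetical names, not part of this repository, and the sketch does not show how any loss is computed.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Pair(NamedTuple):
    """Hypothetical container: one content image paired with one style instruction."""
    content: str      # e.g. "C1"
    instruction: str  # e.g. "X1"


# The three contrastive pairs named in the overview.
PAIRS = [Pair("C1", "X1"), Pair("C2", "X1"), Pair("C2", "X2")]


def shared(pairs: List[Pair], field: str) -> List[Tuple[Pair, Pair]]:
    """Return every two pairs that agree on `field` ('content' or 'instruction')."""
    return [(a, b) for a, b in combinations(pairs, 2)
            if getattr(a, field) == getattr(b, field)]


same_instruction = shared(PAIRS, "instruction")  # same X, different C
same_content = shared(PAIRS, "content")          # same C, different X
```

Here `same_instruction` holds the pair (C1-X1, C2-X1) and `same_content` holds (C2-X1, C2-X2). Presumably these are the comparisons that contrastive reasoning relies on; the paper remains the reference for the actual objective.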

Requirements

This code is implemented under Python 3.8, PyTorch 1.7, and Torchvision 0.8. <br>
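
As a convenience, the stated versions can be compared against the local environment before running anything. The sketch below uses only the standard library; `EXPECTED`, `major_minor`, and `check_environment` are hypothetical names introduced here, and the README does not say whether other versions also work, so a mismatch is reported rather than treated as an error.

```python
import sys
from importlib import metadata  # standard library since Python 3.8

# Versions stated in the Requirements section.
EXPECTED = {"torch": "1.7", "torchvision": "0.8"}


def major_minor(version: str) -> str:
    """Reduce a version such as '1.7.1+cu110' to '1.7'."""
    return ".".join(version.split("+")[0].split(".")[:2])


def check_environment() -> dict:
    """Map each requirement to (installed major.minor or None, matches the README)."""
    py = f"{sys.version_info.major}.{sys.version_info.minor}"
    report = {"python": (py, py == "3.8")}
    for name, wanted in EXPECTED.items():
        try:
            found = major_minor(metadata.version(name))
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        report[name] = (found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_environment().items():
        print(f"{name}: {found or 'not installed'} ({'ok' if ok else 'differs from README'})")
```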

Usage

Dataset

The dataset includes content images and visual attribute instructions (DTD). <br> Please visit WikiArt and here for emotional effect instructions (ArtEmis).

Train

Put sanet.pt in ./_ckpt and dtd.pkl in ./_data.

python train_lva.py
python train_ctr.py

Inference & GUI

Put clva_dtd.pt in ./_ckpt.

python inference.py
python gui.py
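
The file placements described under Train and Inference & GUI can be verified before launching a script. The following is a minimal sketch, assuming it is run from the repository root; `REQUIRED` and `missing_files` are hypothetical names and not part of this repository.

```python
from pathlib import Path
from typing import Dict, List

# File placements taken from the Train and Inference & GUI instructions.
REQUIRED: Dict[str, List[str]] = {
    "train": ["_ckpt/sanet.pt", "_data/dtd.pkl"],
    "inference": ["_ckpt/clva_dtd.pt"],
}


def missing_files(stage: str, root: str = ".") -> List[str]:
    """Return the files required for `stage` that are absent under `root`."""
    return [rel for rel in REQUIRED[stage] if not (Path(root) / rel).is_file()]


if __name__ == "__main__":
    for stage in REQUIRED:
        missing = missing_files(stage)
        print(f"{stage}: {'ready' if not missing else 'missing ' + ', '.join(missing)}")
```

An empty list from `missing_files("train")` means both training inputs are in place; the same call with `"inference"` checks the checkpoint needed by `inference.py` and `gui.py`.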

Citation

@inproceedings{fu2022ldast, 
  author = {Tsu-Jui Fu and Xin Eric Wang and William Yang Wang}, 
  title = {{Language-Driven Artistic Style Transfer}}, 
  booktitle = {European Conference on Computer Vision (ECCV)}, 
  year = {2022} 
}

Acknowledgement

This code is based on SANet.