Self-Supervised-Sketch2Image-pytorch

A PyTorch implementation of the self-supervised sketch-to-image model. The paper can be found here.

0. Data

For the CelebA and WikiArt painting datasets, the pre-processed RGB images and their corresponding sketch images are available at this link.

1. Description

The code is structured as follows:

2. How to run

2.1 Synthesize sketches with TOM

To train TOM to synthesize sketches for an RGB-image dataset, put all your RGB images in one folder and all your collected sketches in another folder, then run:

cd sketch_styletransfer
python train.py --path_a /path/to/RGB-image-folder --path_b /path/to/real-sketches

You can also see all the training options by:

python train.py --help

The code will automatically create a new folder to store all the trained checkpoints and intermediate synthesis results.

Once training has finished, you can generate sketches for all RGB images in your dataset using one of the saved checkpoints:

python evaluate.py --path_content /path/to/RGB-image-folder --path_result /your/customized/path/for/synthesized-sketches --checkpoint /choose/any/checkpoint/you/like
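
If you prefer to script this step, the following is a minimal sketch that picks the most recently modified file in a checkpoint folder and assembles the evaluate.py command shown above. The helper names latest_checkpoint and build_evaluate_command are hypothetical, and the sketch assumes that checkpoints are saved as regular files in a single folder; adjust it to whatever layout your training run actually produces.

```python
import sys
from pathlib import Path


def latest_checkpoint(folder):
    """Return the most recently modified file in `folder` (hypothetical helper)."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    if not files:
        raise FileNotFoundError(f"no checkpoint files found in {folder}")
    return max(files, key=lambda p: p.stat().st_mtime)


def build_evaluate_command(content_dir, result_dir, checkpoint):
    """Assemble the evaluate.py call using only the flags shown in this README."""
    return [
        sys.executable, "evaluate.py",
        "--path_content", str(content_dir),
        "--path_result", str(result_dir),
        "--checkpoint", str(checkpoint),
    ]
```

The returned list can be passed to subprocess.run(cmd, check=True) from the folder that contains evaluate.py. The newest checkpoint is not necessarily the best one, so it may be worth comparing the intermediate synthesis results before choosing.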

2.2 Train the Sketch-to-image model

Training the main model is as simple as

python train.py 

This command automatically trains both the AutoEncoder and the GAN (the AE is trained first, then the GAN, as described in the paper). A folder is created automatically to save intermediate checkpoints and generated images.

The benchmark FID on the training set is also printed to the terminal every fixed number of iterations.
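
For reference, FID (Fréchet Inception Distance) compares the Gaussian statistics of Inception features extracted from real and generated images; lower values are better. Writing (\mu_r, \Sigma_r) and (\mu_g, \Sigma_g) for the feature mean and covariance of the real and generated images (symbols introduced here for illustration, not taken from the code), the standard definition is:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

How this code computes the statistics (for example, how many images are sampled) is not stated here, so check the training script for the exact settings.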

2.2.1 Dataset

This code is ready to train on your own image datasets. Training on the datasets used in the paper (CelebA and WikiArt) and training on your own dataset work the same way: just place all images of a dataset into one folder.
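
If your images are spread across subfolders, a small script can gather them into one folder first. The following is a minimal sketch, not part of this repository: the helper name flatten_images and the list of image extensions are assumptions, and the destination folder is assumed to lie outside the source tree.

```python
import shutil
from pathlib import Path

# Assumed set of image extensions; extend it to match your data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def flatten_images(src_root, dst_folder):
    """Copy every image found under `src_root` into the single folder `dst_folder`."""
    src_root, dst_folder = Path(src_root), Path(dst_folder)
    dst_folder.mkdir(parents=True, exist_ok=True)
    copied = 0
    for path in sorted(src_root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # Join the relative path parts into the new file name so that files
        # with the same name in different subfolders do not collide.
        rel = path.relative_to(src_root)
        target = dst_folder / "_".join(rel.parts)
        shutil.copy2(path, target)
        copied += 1
    return copied
```

For example, flatten_images("/path/to/nested-dataset", "/path/to/flat-folder") returns the number of images copied, and the flat folder can then be used as the dataset folder.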

2.2.2 Configure training

To configure the training, just edit config.py. Note that there are three available values for "DATA_NAME": "art", "face" and "shoe". "art" selects the structure tuned for the WikiArt dataset, "face" is for CelebA, and "shoe" is for small datasets with only a few thousand RGB images (or even fewer). When training on your own dataset, you can try all three options and see which one works best. Note that the "shoe" option adds an extra training objective to the AutoEncoder to enforce the effectiveness of the Content-Encoder.
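
As an illustration, the relevant setting might look like the fragment below. This is a sketch that assumes "DATA_NAME" is a plain module-level string in config.py; the other settings in that file are not shown, and the check at the end is a hypothetical addition rather than part of the original code.

```python
# Hypothetical excerpt of config.py; only DATA_NAME is taken from this README.
VALID_DATA_NAMES = ("art", "face", "shoe")  # the three documented options

# "art"  -> structure tuned for WikiArt
# "face" -> structure tuned for CelebA
# "shoe" -> small datasets (a few thousand images or fewer)
DATA_NAME = "face"

# Assumed safeguard (not in the original code): fail early on a typo.
if DATA_NAME not in VALID_DATA_NAMES:
    raise ValueError(f"DATA_NAME must be one of {VALID_DATA_NAMES}, got {DATA_NAME!r}")
```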

2.3 Evaluation

Running benchmarks is as simple as

python benchmark.py

You need to specify the model path and the image folder path inside the code before running it.

We also provide code for generating the kinds of images displayed in the paper, including style-transfer, style-mixing and sketch-to-image results. All of this code is located in the "evaluate" folder.

3. Extra notes

The provided code is for research use only, and is a simplified version of what is described in the paper. Only a few changes were made, due to business concerns (we developed this model for commercial use).

Despite the changes, the code is still able to train a model that beats state-of-the-art models, just as we claimed in the paper, and is only slightly worse than our fully-fledged version. Importantly, we believe one can easily re-implement the omitted parts mentioned above.