Paint-by-Sketch

Paper

Kangyeol Kim, Sunghyun Park, Junsoo Lee and Jaegul Choo.

Teaser

Multi-backgrounds

Multi-references

Abstract

Recent remarkable improvements in large-scale text-to-image generative models have shown promising results in generating high-fidelity images. To further enhance editability and enable fine-grained generation, we introduce a multi-input-conditioned image composition model that incorporates a sketch as a novel modality, alongside a reference image. Thanks to the edge-level controllability using sketches, our method enables a user to edit or complete an image sub-part with a desired structure (i.e., sketch) and content (i.e., reference image). Our framework fine-tunes a pre-trained diffusion model to complete missing regions using the reference image while maintaining sketch guidance. Albeit simple, this leads to wide opportunities to fulfill user needs for obtaining the in-demand images. Through extensive experiments, we demonstrate that our proposed method offers unique use cases for image manipulation, enabling user-driven modifications of arbitrary scenes.

Environment & Pre-trained models

Dependencies

$ conda env create -f environment.yaml
$ conda activate paint_sketch
$ pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
$ pip install opencv-python==4.6.0.66 opencv-python-headless==4.6.0.66 matplotlib==3.2.2 streamlit==1.14.1 streamlit-drawable-canvas==0.9.2
$ pip install git+https://github.com/openai/CLIP.git
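After installation, it can help to confirm that the pinned versions are the ones actually active in the `paint_sketch` environment. The following is a minimal sketch and not part of this repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the pins are simply copied from the pip commands above. It relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Pins copied from the pip commands above; local build tags such as "+cu113" are ignored.
PINNED = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "torchaudio": "0.11.0",
    "opencv-python": "4.6.0.66",
    "opencv-python-headless": "4.6.0.66",
    "matplotlib": "3.2.2",
    "streamlit": "1.14.1",
    "streamlit-drawable-canvas": "0.9.2",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        # Drop a local build suffix ("1.11.0+cu113" -> "1.11.0") before comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches(PINNED).items():
        print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

An empty output means every pinned package matches. The `lookup` argument exists only so the comparison can be exercised without touching the real environment.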

Download checkpoints

Paint-by-Sketch
    pretrained_models/
        model-modified-12channel.ckpt        
    models/
        Cartoon_v1_aesthetic/
            ...
    ...
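Before training or running the demo, the layout above can be checked with a few lines of Python. This is only a sketch under the assumption that the two paths shown in the tree are the ones that matter; the function name `missing_paths` is hypothetical, and the entries elided with `...` are deliberately not checked.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the repository root.
EXPECTED_PATHS = [
    "pretrained_models/model-modified-12channel.ckpt",
    "models/Cartoon_v1_aesthetic",
]


def missing_paths(repo_root, expected=EXPECTED_PATHS):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root (the Paint-by-Sketch directory).
    for rel in missing_paths("."):
        print(f"missing: {rel}")
```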

Data preparation

bash preprocess_dataset/run_preprocess.sh <path/to/image_root> <gpu_id>
# e.g.,
# bash preprocess_dataset/run_preprocess.sh /home/nas2_userF/kangyeol/Project/webtoon2022/Paint-by-Sketch/samples 7

IMAGE_ROOT
    images/
        000000.png
        000001.png
        ...
    sketch_bin/
        000000.png
        000001.png
        ...
    sketch(Not used)/
        ... 
    ...
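The tree above suggests that each file in `images/` has a same-named counterpart in `sketch_bin/`. Assuming that pairing holds (it is inferred from the listing, not confirmed by the training code), a quick consistency check could look like the sketch below. The function name `unpaired_files` is hypothetical.

```python
from pathlib import Path


def unpaired_files(image_root):
    """Return (images without a sketch, sketches without an image) as sorted file names."""
    root = Path(image_root)
    images = {p.name for p in (root / "images").glob("*.png")}
    sketches = {p.name for p in (root / "sketch_bin").glob("*.png")}
    return sorted(images - sketches), sorted(sketches - images)


if __name__ == "__main__":
    import sys

    # Usage: python check_pairs.py <path/to/image_root>
    no_sketch, no_image = unpaired_files(sys.argv[1])
    print(f"{len(no_sketch)} images lack a sketch, {len(no_image)} sketches lack an image")
```

Two empty lists mean every image has a binarized sketch and vice versa.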

Training

bash cartoon_train.sh <gpu_ids> <path/to/logdir> <path/to/config>

# e.g.,
# bash cartoon_train.sh 0,1 models/test configs/v1_aesthetic_sketch_image.yaml

Demo

  1. Run the Streamlit server.
streamlit run demo/app.py --server.port=8507 --server.fileWatcherType none
  2. Upload the source image.
<p align="center"> <img src="asset/1_load_image.png"> </p>
  3. Draw the mask and the sketch separately.
<p align="center"> <img src="asset/2_draw_mask_sketch.png"> </p>
  4. Upload a reference image.
<p align="center"> <img src="asset/3_load_exemplar.png"> </p>
  5. Run inference and export the result.
<p align="center"> <img src="asset/4_inference_export.png"> </p>

Issues

Citation

@misc{kim2023referencebased,
    title={Reference-based Image Composition with Sketch via Structure-aware Diffusion Model},
    author={Kangyeol Kim and Sunghyun Park and Junsoo Lee and Jaegul Choo},
    year={2023},
    eprint={2304.09748},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

License

The code in this repository is released under the MIT License.

Acknowledgements

This code borrows heavily from Stable Diffusion and Paint-by-Example.