Learning Geometry-aware Representations by Sketching (CVPR 2023)

Hyundo Lee*, Inwoo Hwang, Hyunsung Go, Won-Seok Choi, Kibeom Kim, and Byoung-Tak Zhang <br> (AI Institute, Seoul National University)

This repository contains the PyTorch code for reproducing our paper "Learning Geometry-aware Representations by Sketching".

<p align="center" style="color:blue"> <img src="md_images/fig1.svg" style="width:50%; margin-right:40px"> $\;$ <img src="md_images/overlayed.png" width="18%"> </p>

At a high level, our model learns to abstract an image into a stroke-based color sketch that accurately reflects the geometric information (e.g., position, shape, size). Our sketch consists of a set of strokes represented by a parameterized vector that specifies their curvature, color, and thickness. We use these parameterized vectors directly as a compact representation of an image.
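For illustration, a single stroke's parameter vector might look like the following. This is a minimal sketch: the field names and the cubic Bézier parameterization are assumptions made for this example, not the repository's exact format.

```python
from dataclasses import dataclass

@dataclass
class Stroke:
    """A hypothetical parameterized stroke: a cubic Bezier curve
    with color and thickness attributes."""
    control_points: list  # 4 (x, y) pairs defining the curve
    color: tuple          # (r, g, b) in [0, 1]
    thickness: float      # stroke width

    def to_vector(self):
        """Flatten into a compact per-stroke representation vector."""
        flat = [c for pt in self.control_points for c in pt]
        return flat + list(self.color) + [self.thickness]

stroke = Stroke(
    control_points=[(0.1, 0.1), (0.3, 0.5), (0.6, 0.5), (0.9, 0.9)],
    color=(1.0, 0.0, 0.0),
    thickness=0.02,
)
vec = stroke.to_vector()  # 8 coords + 3 color channels + 1 thickness = 12 dims
```

A sketch is then just a set of such vectors, which together serve as the image representation.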

Overview

An overview of LBS (Learning by Sketching): a CNN-based encoder, a Transformer-based stroke generator, a stroke embedding network, and a differentiable rasterizer.

For training, we use a CLIP-based perceptual loss and guidance strokes from optimization-based generation (CLIPasso). You can optionally train with an additional loss function specified by the --embed_loss argument (choices=['ce', 'simclr', 'supcon']).
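The --embed_loss flag could be wired up roughly as follows. This is a hypothetical sketch of only the argument parsing; the actual loss implementations live in the repository.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--embed_loss",
    choices=["ce", "simclr", "supcon"],
    default=None,
    help="optional additional loss on the stroke embeddings: "
         "cross-entropy, SimCLR, or supervised contrastive",
)

# e.g. invoking the script with `--embed_loss supcon`
args = parser.parse_args(["--embed_loss", "supcon"])
# args.embed_loss == "supcon"; None means no additional embedding loss
```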

Dependencies

The code has been confirmed to run in the following environment:

Install dependencies via pip.

# Clone the repo:
git clone https://github.com/illhyhl1111/LearningBySketching.git LBS
cd LBS

# Install dependencies:
pip install -r requirements.txt 
## If pytorch not installed:
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html

# Install diffvg (differentiable rasterizer):
git clone https://github.com/BachiLi/diffvg
cd diffvg
git submodule update --init --recursive
python setup.py install

Preparing datasets

By default, the code assumes that all the datasets are located under ./data/. You can change this path by specifying --data_root.

--data_root
├── clevr
│   ├── questions
│   ├── scenes
│   ├── images
│   │   ├── val
│   │   ├── train
│   │   │   ├── CLEVR_train_xxxxxx.png
│   │   │   └── ...
│   │   └── test
│   └── README.txt
└── geoclidean
    ├── constraints
    └── elements
        ├── train
        │   ├── triangle
        │   │   ├── in_xxx_fin.png
        │   │   └── ...
        │   └── ...
        └── test

Scripts

Generate guidance stroke

To train our model with the CLEVR and STL-10 datasets, you must first generate guidance strokes.

  1. Download pre-generated strokes from:

    Put the downloaded files into gt_sketches/

  2. Or generate the guidance strokes yourself with:

sudo apt install python3-dev cmake
git clone https://github.com/BachiLi/diffvg
cd diffvg
git submodule update --init --recursive
python setup.py install
cd ../
python generate_data.py --config_path config/stl10.yaml --output_dir ./gt_sketches/stl10_train+unlabeled/ --dataset stl10_train+unlabeled --data_root /your/path/to/dir --visualize --device cuda
python merge_data.py --output_file ./gt_sketches/path_stl10.pkl --data_files ./gt_sketches/stl10_train+unlabeled/data_* --maskarea_files ./gt_sketches/stl10_train+unlabeled/maskareas_*
python generate_data.py --config_path config/clevr.yaml --output_dir ./gt_sketches/clevr_train --dataset clevr_train --data_root /your/path/to/dir --num_generation 10000 --visualize --device cuda
python merge_data.py --output_file ./gt_sketches/path_clevr.pkl --data_files ./gt_sketches/clevr_train/data_* --maskarea_files ./gt_sketches/clevr_train/maskareas_*

The execution of generate_data.py can be split into multiple chunks with the --chunk (num_chunk) (chunk_idx) option:

python generate_data.py ... --chunk 2 0
python generate_data.py ... --chunk 2 1
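Under the assumption that --chunk partitions the sample indices evenly, the split behaves roughly like the following sketch (the likely behavior, not the script's exact implementation):

```python
def chunk_indices(num_samples, num_chunk, chunk_idx):
    """Select the slice of sample indices handled by one chunk."""
    per_chunk = (num_samples + num_chunk - 1) // num_chunk  # ceil division
    start = chunk_idx * per_chunk
    return list(range(start, min(start + per_chunk, num_samples)))

# `--chunk 2 0` and `--chunk 2 1` together cover all 10000 samples:
first = chunk_indices(10000, 2, 0)   # indices 0..4999
second = chunk_indices(10000, 2, 1)  # indices 5000..9999
```

Running the chunks on separate machines (or sequentially) and then merging with merge_data.py yields the same result as one full run.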

Train LBS

# Rotated MNIST
python main.py --data_root /your/path/to/dir --config_path config/rotmnist.yaml

# Geoclidean-Elements
python main.py --data_root /your/path/to/dir --config_path config/geoclidean_elements.yaml 
# Geoclidean-Constraints
python main.py --data_root /your/path/to/dir --config_path config/geoclidean_constraints.yaml 

# CLEVR
python main.py --data_root /your/path/to/dir --config_path config/clevr.yaml 

# STL-10
python main.py --data_root /your/path/to/dir --config_path config/stl10.yaml 

Note: More than 30 GB of GPU memory is required to run the default configuration for CLEVR and STL-10. Multi-GPU training with DDP is not currently supported.
If you run out of memory, we recommend changing --clip_model_name to RN50 or reducing --num_aug_clip to lower the CLIP model's memory usage, or reducing the batch size; note that performance may degrade.

Optional arguments:

Evaluation

python evaluate.py logs/{dataset}/{target_folder}

Citation

If you make use of our work, please cite our paper:

@inproceedings{lee2023learning,
  title={Learning Geometry-aware Representations by Sketching},
  author={Lee, Hyundo and Hwang, Inwoo and Go, Hyunsung and Choi, Won-Seok and Kim, Kibeom and Zhang, Byoung-Tak},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23315--23326},
  year={2023}
}