DGTS

This repository contains the official code for the paper "Delving Globally into Texture and Structure for Image Inpainting" by Haipeng Liu (hpliu_hfut@hotmail.com), Yang Wang (corresponding author: yangwang@hfut.edu.cn), Meng Wang, and Yong Rui, ACM Multimedia 2022, Lisbon, Portugal.

Introduction

In this paper, we delve globally into texture and structure information to capture the semantics for image inpainting. Unlike current decoder-only transformers that operate at the pixel level, our model adopts a transformer pipeline with both an encoder and a decoder. On one hand, the encoder captures the texture semantic correlations of all patches across the image via a self-attention module. On the other hand, an adaptive patch vocabulary is dynamically established in the decoder for the filled patches over the masked regions. Building on this, a structure-texture matching attention module (Eqs. 5 and 6) anchored on the known regions marries the best of these two worlds for progressive inpainting via a probabilistic diffusion process (Eq. 8). From the perspective of texture and structure information, our model is orthogonal to the fashionable arts for image inpainting, such as Convolutional Neural Networks (CNNs), attention, and transformer models.

<p align="center">Figure 1. Illustration of the proposed transformer pipeline.</p>
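For intuition only, here is a minimal PyTorch sketch of the encoder's core idea: every patch token attends to every other patch across the image. The module name, tensor shapes, and hyperparameters are illustrative assumptions; the exact formulation (Eqs. 5, 6, and 8) is given in the paper, not reproduced here.

```python
# Minimal sketch of patch-level self-attention over image patches (not the
# paper's exact Eq. 5/6 formulation; dimensions and names are illustrative).
import torch
import torch.nn as nn

class PatchSelfAttention(nn.Module):
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, patch_tokens):
        # patch_tokens: (batch, num_patches, dim) embeddings of all patches,
        # so every patch attends to every other patch across the image.
        x = self.norm(patch_tokens)
        out, _ = self.attn(x, x, x)
        return patch_tokens + out  # residual connection

# Example: a 256x256 image split into 16x16 patches -> 256 tokens of dim 512.
tokens = torch.randn(1, 256, 512)
print(PatchSelfAttention(512)(tokens).shape)  # torch.Size([1, 256, 512])
```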

In summary, our main contributions are:

- A transformer pipeline with both an encoder and a decoder for image inpainting, in contrast to existing decoder-only, pixel-level transformers.
- An adaptive patch vocabulary dynamically established in the decoder for the filled patches over the masked regions.
- A structure-texture matching attention module anchored on the known regions, enabling progressive inpainting via a probabilistic diffusion process.

<p align="center">Figure 2. Intuition of the bridge module.</p>

Run

  1. Requirements:
     - Python >= 3.6
     - PyTorch >= 1.0
     - NVIDIA GPU + CUDA cuDNN
     - scikit-image
     - scipy
     - opencv-python
     - matplotlib
  1. To train the proposed model described in the paper, prepare the training datasets, put them in ./data/places2/train, then run the following command:
     python3 /DGTS/code/train/run_train.py
  1. To inpaint the masked images, prepare the testing datasets, put them in ./data/places2/test, then run the following command:
     python3 /DGTS/code/test/run_train.py
  1. Please download the pre-trained models for Places2, CelebA, or PSV.
  1. Following previous work, each input image is randomly paired with a mask adopted from the widely used irregular mask dataset to generate the masked images for training and testing in our paper (see the sketch after this list).
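
For illustration only, below is a minimal sketch of this random image-mask pairing. The file layout, the mask directory, and the 0/1 hole convention are assumptions for the example, not the repo's actual data loader.

```python
# Hypothetical sketch of random image-mask pairing for training/testing.
# Paths and the 0/1 mask convention are assumptions, not the repo's API.
import random
from glob import glob

import cv2
import numpy as np

images = sorted(glob("./data/places2/train/*.jpg"))
masks = glob("./irregular_masks/*.png")  # assumed location of the mask set

img = cv2.imread(random.choice(images))
mask = cv2.imread(random.choice(masks), cv2.IMREAD_GRAYSCALE)
mask = cv2.resize(mask, (img.shape[1], img.shape[0]))
mask = (mask > 127).astype(np.uint8)  # 1 = hole, 0 = known region

masked = img * (1 - mask)[..., None]  # zero out the masked (hole) pixels
cv2.imwrite("masked_example.png", masked)
```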

Example Results

Citation

If any part of our paper or this repository is helpful to your work, please cite:

@inproceedings{liu2022delving,
  title={Delving Globally into Texture and Structure for Image Inpainting},
  author={Liu, Haipeng and Wang, Yang and Wang, Meng and Rui, Yong},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={1270--1278},
  year={2022}
}