RegionDrag: Fast Region-Based Image Editing with Diffusion Models (ECCV 2024)

Jingyi Lu<sup>1</sup>, Xinghui Li<sup>2</sup>, Kai Han<sup>1</sup><br> <sup>1</sup>Visual AI Lab, The University of Hong Kong; <sup>2</sup>Active Vision Lab, University of Oxford

Open In Colab <a href="https://visual-ai.github.io/regiondrag"><img alt='page' src="https://img.shields.io/badge/Project-Website-orange"></a> <a href="https://arxiv.org/abs/2407.18247"><img alt='arXiv' src="https://img.shields.io/badge/arXiv-2407.18247-b31b1b.svg"></a> <a href="https://drive.google.com/file/d/1rdi4Rqka8zqHTbPyhQYtFC2UdWvAeAGV/view?usp=sharing"><img alt='data' src="https://img.shields.io/badge/Download-Dataset-green.svg"></a>

<table> <tr> <td><img src="assets/time.png" width="450" alt="Time"></td> <td><img src="assets/pipe.png" width="500" alt="Pipe"></td> </tr> </table>

RegionDrag uses pairs of regions, rather than the points used by methods such as DragGAN and DragDiffusion, to drag image content. Visit our project page for various region input examples.

During inference, the SD latent representations of the input image are extracted from the 🔴 RED regions during inversion and mapped to the 🔵 BLUE regions during denoising across multiple timesteps.
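
The core operation can be sketched roughly as below. This is a minimal, hypothetical illustration of the copy-and-paste idea and not the repository's implementation; the function name, tensor layout, and blending weight `alpha` are assumptions.

import torch

def paste_region_latents(latent, inverted_feats, blue_coords, alpha=1.0):
    """Illustrative sketch (not the actual implementation): blend features
    gathered at RED (handle) positions during inversion into the BLUE (target)
    positions of the latent at the current denoising timestep.

    latent:         (1, C, H, W) SD latent at the current timestep
    inverted_feats: (N, C) latent features extracted at RED positions during inversion
    blue_coords:    (N, 2) integer (y, x) positions in the latent grid
    alpha:          blending weight between copied and current features (assumed)
    """
    out = latent.clone()
    for (y, x), feat in zip(blue_coords.tolist(), inverted_feats):
        out[0, :, y, x] = alpha * feat + (1.0 - alpha) * out[0, :, y, x]
    return out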

Installation

CUDA support is required to run our code; alternatively, you can try our Colab Demo for easy access to GPU resources. <br> To install RegionDrag locally, run the following in a terminal:

git clone https://github.com/LuJingyi-John/RegionDrag.git
cd RegionDrag
pip install -r requirements.txt

# Optional: the InstantDrag backbone is now supported (https://github.com/SNU-VGILab/InstantDrag)
git clone https://github.com/SNU-VGILab/InstantDrag instantdrag

Run RegionDrag

After installing the requirements, you can launch the user interface with:

python3 ui.py

For detailed instructions to use our UI, check out our User Guide.

DragBench-SR & DragBench-DR

To evaluate region-based editing, we introduce DragBench-SR and DragBench-DR ('R' is short for 'Region'), which are modified versions of DragBench-S (100 samples) and DragBench-D (205 samples). These benchmarks are consistent with their point-based counterparts but use regions instead of points to reflect user intention. You can download the dataset HERE.

drag_data/
├── dragbench-dr/
│   ├── animals/
│   │   ├── JH_2023-09-14-1820-16/
│   │   │   ├── original_image.png
│   │   │   ├── user_drag.png
│   │   │   ├── meta_data.pkl
│   │   │   └── meta_data_region.pkl
│   │   └── ...
│   └── ...
└── dragbench-sr/
    ├── art_0/
    │   ├── original_image.png
    │   ├── user_drag.png
    │   ├── meta_data.pkl
    │   └── meta_data_region.pkl
    └── ...

meta_data.pkl and meta_data_region.pkl store user-interaction metadata as a dictionary of the following format:

{
    'prompt': text prompt describing the output image,
    'points': list of points [(x1, y1), (x2, y2), ..., (xn, yn)],
              handle points: (x1, y1), (x3, y3), ...; target points: (x2, y2), (x4, y4), ...,
    'mask': binary mask specifying the editable area,
}
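
For example, a sample's metadata can be loaded and split into handle/target pairs roughly as follows (a minimal sketch; the sample path is taken from the tree above, and only the keys listed above are assumed):

import pickle

# Load the region-based metadata for one sample (example path from the tree above).
with open('drag_data/dragbench-sr/art_0/meta_data_region.pkl', 'rb') as f:
    meta = pickle.load(f)

prompt = meta['prompt']   # text prompt describing the output image
mask = meta['mask']       # binary mask of the editable area
points = meta['points']   # [(x1, y1), (x2, y2), ..., (xn, yn)]

# Even-indexed entries are handle points, odd-indexed entries are their targets.
handle_points = points[0::2]
target_points = points[1::2]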

BibTeX

@inproceedings{lu2024regiondrag,
  author    = {Jingyi Lu and Xinghui Li and Kai Han},
  title     = {RegionDrag: Fast Region-Based Image Editing with Diffusion Models},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2024},
}

Acknowledgement

Insightful discussions with Cheng Silin and Huang Xiaohu were instrumental in refining our methodology. The intuitive layout of the DragDiffusion project inspired our user interface design. Our SDE scheduler implementation builds upon the groundbreaking work by Shen Nie et al. in their SDE-Drag project.