<h1 align="center"> <p>[ECCV 2024] Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing</p></h1> <h4 align="center"> <p> <a href="https://scholar.google.com/citations?user=Q-ARWkwAAAAJ&hl=eh" target="_blank">Wonjun Kang</a><sup>*</sup>, <a href="https://scholar.google.com/citations?user=G1EpeWYAAAAJ&hl=en" target="_blank">Kevin Galim</a><sup>*</sup>, <a href="http://cvml.ajou.ac.kr/wiki/index.php/Professor" target="_blank">Hyung Il Koo</a> </p> <p> <a href="https://furiosa.ai/" target="_blank">FuriosaAI</a> </p> </h4> <p align="center"> <a href="https://arxiv.org/abs/2403.09468"> <img alt="GitHub release" src="https://img.shields.io/badge/arXiv-2403.09468-b31b1b.svg"> </a> </p>
Paper Link: https://arxiv.org/abs/2403.09468
Video: YouTube
Updates
- [03/15/24] Code released.
Usage
- Note: we tested the code on an NVIDIA V100 32GB GPU. Results may differ slightly on other GPUs.
Setup
- Install PyTorch (tested with Python 3.9 and PyTorch 1.13.1), e.g.,
conda create -n diffinv python=3.9
conda activate diffinv
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
- Install requirements
pip install -r requirements.txt
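After installing, it can help to confirm that the active environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes the PyTorch distribution is registered under the name `torch`. Other versions may work as well; the check only reports whether the environment matches the tested setup.

```python
import sys
from importlib import metadata

# Versions this README reports as tested; other versions may also work.
TESTED_PYTHON = (3, 9)
TESTED_TORCH = "1.13.1"


def describe_environment():
    """Return the running Python version and the installed torch version (or None)."""
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = None
    return {"python": sys.version_info[:2], "torch": torch_version}


def matches_tested_setup(env):
    """True if Python and torch match the tested versions (build suffixes such as +cu117 are ignored)."""
    torch_ok = env["torch"] is not None and env["torch"].startswith(TESTED_TORCH)
    return env["python"] == TESTED_PYTHON and torch_ok


if __name__ == "__main__":
    env = describe_environment()
    print(env, "matches tested setup:", matches_tested_setup(env))
```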
Demo
To run the provided Gradio demo, run
python demo/run.py
and open http://localhost:7860/ in your browser.
Edit single image
To edit a single image, use edit_image.py.
python edit_image.py --help
usage: edit_image.py [-h] --input INPUT --source_prompt SOURCE_PROMPT --target_prompt TARGET_PROMPT [--output OUTPUT] [--inv_method INV_METHOD] [--edit_method EDIT_METHOD] [--edit_cfg EDIT_CFG] [--scheduler {ddim,ddpm,dpm}] [--steps STEPS] [--guidance_scale_bwd GUIDANCE_SCALE_BWD] [--guidance_scale_fwd GUIDANCE_SCALE_FWD]
Edits a single image.
optional arguments:
-h, --help show this help message and exit
--input INPUT Path to image to invert.
--source_prompt SOURCE_PROMPT
Prompt to use for inversion.
--target_prompt TARGET_PROMPT
Prompt to use for inversion.
--output OUTPUT Path for output image.
--inv_method INV_METHOD
Available inversion methods:
diffinv Naiv DDIM inversion
nti Null text inversion
npi Negative prompt inversion
proxnpi Proximal negative prompt inversion
edict EDICT inversion
ddpminv DDPM inversion
dirinv Direct inversion
etainv Eta inversion
--edit_method EDIT_METHOD
Available editing methods:
simple Simple denoising of inverted latent with target prompt
ptp Prompt-to-prompt
masactrl MasaControl
pnp Plug-and-play
pix2pix_zero Pix2Pix zero
--edit_cfg EDIT_CFG Path to yaml file for editor configuration. Often needed for prompt-to-prompt.
--scheduler {ddim,ddpm,dpm}
Which scheduler to use.
--steps STEPS How many diffusion steps to use.
--guidance_scale_bwd GUIDANCE_SCALE_BWD
Classifier free guidance scale to use for backward diffusion (denoising).
--guidance_scale_fwd GUIDANCE_SCALE_FWD
Classifier free guidance scale to use for forward diffusion (inversion).
--prec {fp16,fp32} Precision for diffusion.
E.g., for prompt-to-prompt editing using Eta Inversion of test/data/house.png with prompt "a house->monster in the woods", run
python edit_image.py \
--inv_method etainv \
--edit_method ptp \
--input test/data/house.png \
--source_prompt "a house in the woods" \
--target_prompt "a monster in the woods" \
--output out.png
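To compare several inversion methods on the same image, the command above can be assembled programmatically. The sketch below is an illustration rather than part of the repository: `build_edit_command` and the `out_<method>.png` naming are hypothetical, and only flags listed in the help text are used. Some method combinations may additionally need `--edit_cfg`, which is not set here.

```python
import shlex


def build_edit_command(input_path, source_prompt, target_prompt,
                       inv_method="etainv", edit_method="ptp", output="out.png"):
    """Build the argument list for one edit_image.py call (hypothetical helper)."""
    return [
        "python", "edit_image.py",
        "--inv_method", inv_method,
        "--edit_method", edit_method,
        "--input", input_path,
        "--source_prompt", source_prompt,
        "--target_prompt", target_prompt,
        "--output", output,
    ]


if __name__ == "__main__":
    # Inversion method names are taken from the --help listing above.
    for method in ["diffinv", "npi", "etainv"]:
        cmd = build_edit_command(
            "test/data/house.png",
            "a house in the woods",
            "a monster in the woods",
            inv_method=method,
            output=f"out_{method}.png",  # assumed naming scheme, one file per method
        )
        # Print the shell-quoted command; pass cmd to subprocess.run(cmd, check=True) to execute it.
        print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems with prompts that contain spaces.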
Prepare datasets (for evaluation)
- PIE: Download from here and extract to data/eval/PIE-Bench_v1
- Plug-and-Play
  - Download from here and extract to data/eval/plug_and_play
  - Run
python scripts/convert_plug_and_play_imagenetr-ti2i.py
python scripts/convert_plug_and_play_imagenetr-fake-ti2i.py
- ImagenHub: No setup needed
Evaluate
For evaluation, prepare the dataset(s) above that you want to test and create a .yaml config file inside cfg/eval listing the datasets, inversion methods and editing methods you want to evaluate. For a reference config file, see cfg/eval/demo.yaml. The evaluation script runs each combination given under data, edit_method and method. E.g., for the config file cfg/eval/demo.yaml, evaluation will run for (diffinv, ptp), (npi, ptp) and (etainv, ptp). After preparing the config .yaml file, use
python eval.py --help
usage: eval.py [-h] --cfg CFG [CFG ...] [--device DEVICE [DEVICE ...]] [--no_proc]
Run evaluation for the given config file. The result will be stored under result/{cfg_file_name}. For each combination of dataset, inversion and editing method in the config file, a separate directory will be created in result/{cfg_file_name}
optional arguments:
-h, --help show this help message and exit
--cfg CFG [CFG ...] Config file(s) for evaluation.
--device DEVICE [DEVICE ...]
Which cuda devices to use. Can be multiple (multiprocessing).
--no_proc Disables multiprocessing.
to perform editing and save all images under result/{cfg_file_name}. E.g., if you want to use cfg/eval/demo.yaml with one GPU, run
python eval.py --cfg cfg/eval/demo.yaml
All images will be stored under result/demo.
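The combination rule described for the config file (every entry under data, edit_method and method is paired with every other) amounts to a Cartesian product. The sketch below illustrates this with a hypothetical in-memory config: the key names come from the text above, the method lists mirror the demo example, and the dataset name `pie` is a placeholder rather than a value read from cfg/eval/demo.yaml. How eval.py enumerates runs internally is not shown here.

```python
from itertools import product

# Hypothetical config contents using the keys named in this README.
# "pie" is a placeholder dataset name, not taken from cfg/eval/demo.yaml.
config = {
    "data": ["pie"],
    "edit_method": ["ptp"],
    "method": ["diffinv", "npi", "etainv"],
}


def enumerate_runs(cfg):
    """List every (dataset, inversion method, editing method) combination."""
    return list(product(cfg["data"], cfg["method"], cfg["edit_method"]))


if __name__ == "__main__":
    for dataset, inv_method, edit_method in enumerate_runs(config):
        print(f"dataset={dataset} inversion={inv_method} editing={edit_method}")
```

The number of runs is the product of the three list lengths, so adding one more dataset to this example would double it from three to six.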
Afterwards, you can compute metrics on the output images using
python compute_metrics.py --help
usage: compute_metrics.py [-h] [--cfg CFG [CFG ...]] [--metric METRIC [METRIC ...]]
optional arguments:
-h, --help show this help message and exit
--cfg CFG [CFG ...] Config file(s) for evaluation.
--metric METRIC [METRIC ...]
Metric(s) to compute. If not specified, all metrics are computed.
The metrics will be stored as a .yaml file in a folder named metrics under each experiment directory. E.g., to compute metrics for cfg/eval/demo.yaml, run
python compute_metrics.py --cfg cfg/eval/demo.yaml
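To see which metric files were written, the result tree can be scanned for `metrics` folders. This is a minimal sketch under stated assumptions: `find_metric_files` is a hypothetical helper, it assumes only the layout described above (a `metrics` folder containing .yaml files somewhere below the result directory), and it does not parse the files because their schema is not documented here.

```python
from pathlib import Path


def find_metric_files(result_dir):
    """Return all .yaml files located directly inside any 'metrics' folder below result_dir."""
    root = Path(result_dir)
    # rglob searches recursively, so the nesting depth of experiment directories does not matter.
    return sorted(root.rglob("metrics/*.yaml"))


if __name__ == "__main__":
    for path in find_metric_files("result/demo"):
        # path.parent is the metrics folder; its parent is the experiment directory.
        print(path.parent.parent.name, "->", path.name)
```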
Finally, you can visualize the computed metrics with notebooks/visualize_results.ipynb. Please see the notebook for further details.
Citation
@article{kang2024eta,
title={Eta Inversion: Designing an Optimal Eta Function for Diffusion-based Real Image Editing},
author={Kang, Wonjun and Galim, Kevin and Koo, Hyung Il},
journal={arXiv preprint arXiv:2403.09468},
year={2024}
}