# MureObjectStitch-Image-Composition
This is the implementation of MureObjectStitch for the following research paper:

**MureObjectStitch: Multi-reference Image Composition** [[arXiv]](https://arxiv.org/abs/2411.07462)
MureObjectStitch is a simple extension of ObjectStitch that supports multiple reference images of one foreground object. In general, more reference images lead to better results.

We release the pretrained model for MureObjectStitch, which achieves good results for common or simple objects. However, the pretrained model struggles to preserve the identity of objects with rich details. If you have a few images containing the foreground object, we suggest finetuning MureObjectStitch on these images, which can greatly improve detail preservation.
MureObjectStitch has been integrated into our image composition toolbox libcom.
Note that in the reference images, the foreground object should occupy the whole image (see our example), otherwise the performance would be severely affected.
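If your reference images contain large background regions, you can crop them around the object first. Below is a minimal sketch of such a preprocessing step, assuming you have a binary mask of the object; the file paths and `margin` value are placeholders, and this helper is not part of the released code.

```python
# Minimal sketch (not part of this repo): crop a reference image so the
# foreground object roughly fills it, using a binary object mask of the
# same size as the image. Paths and margin are placeholder values.
import numpy as np
from PIL import Image

def crop_to_object(image_path: str, mask_path: str, out_path: str, margin: int = 10) -> None:
    image = Image.open(image_path).convert("RGB")
    mask = np.array(Image.open(mask_path).convert("L")) > 127
    ys, xs = np.nonzero(mask)  # pixel coordinates covered by the object
    left = max(0, int(xs.min()) - margin)
    top = max(0, int(ys.min()) - margin)
    right = min(image.width, int(xs.max()) + margin)
    bottom = min(image.height, int(ys.max()) + margin)
    image.crop((left, top, right, bottom)).save(out_path)

crop_to_object("raw_fg.jpg", "raw_fg_mask.png", "fg1.jpg")  # placeholder paths
```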
<p align='center'> <img src='./figs/multiple_foreground_images.jpg' width=75% /> <img src='./figs/network.jpg' width=70% /> </p>

## Get Started
### 1. Dependencies
- torch==1.11.0
- pytorch_lightning==1.8.1
- Install dependencies:

  ```bash
  cd ObjectStitch-Image-Composition
  pip install -r requirements.txt
  cd src/taming-transformers
  pip install -e .
  ```
### 2. Download the Pretrained Models
- Please download the following files to the `checkpoints` folder to create the following file tree:

  ```
  checkpoints/
  ├── ObjectStitch.pth
  └── openai-clip-vit-large-patch14
      ├── config.json
      ├── merges.txt
      ├── preprocessor_config.json
      ├── pytorch_model.bin
      ├── tokenizer_config.json
      ├── tokenizer.json
      └── vocab.json
  ```

- openai-clip-vit-large-patch14 (Huggingface | ModelScope).
- ObjectStitch.pth (Huggingface | ModelScope).
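The CLIP weights match the public `openai/clip-vit-large-patch14` repository on Hugging Face, so one option is to fetch them with `huggingface_hub`, as sketched below; `ObjectStitch.pth` still needs to be downloaded manually from the link above.

```python
# Sketch: download the CLIP encoder into the expected folder layout.
# Assumes the public Hugging Face repo "openai/clip-vit-large-patch14";
# ObjectStitch.pth is not in that repo and must be fetched separately.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="openai/clip-vit-large-patch14",
    local_dir="checkpoints/openai-clip-vit-large-patch14",
)
```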
### 3. Finetune on Examples
- Please prepare your training data following the examples in `finetuning_examples`.
- You can use `MureComDataset` in `ldm/data/open_images.py` to load your data.
- To finetune MureObjectStitch, you can use `main.py`. For example,

  ```bash
  python main.py \
  --logdir experiments/objectstitch \
  --name='car_multifg' \
  --num_workers 4 \
  --devices 1 \
  --batch_size 1 \
  --num_nodes 1 \
  --base configs/murecom.yaml \
  --package_name='Car' \
  --fg_name='fg1'
  ```

  or simply run:

  ```bash
  sh finetune.sh
  ```

- We finetune the model for 150 epochs by default. You can modify the training settings in `configs/murecom.yaml`. After finetuning, the checkpoints are saved in `logdir`; you can change where they are stored by modifying this argument. To finetune on several foreground objects in sequence, see the sketch after this list.
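If you need to finetune on several foreground objects, a small driver script can loop over them. The sketch below only wraps the `main.py` command shown above; the `(package_name, fg_name)` pairs are hypothetical and must match the folders you prepared under `finetuning_examples`.

```python
# Sketch: finetune MureObjectStitch on several foreground objects in turn,
# reusing the main.py flags documented above. The JOBS entries are
# hypothetical examples; replace them with your own package/fg names.
import subprocess

JOBS = [("Car", "fg1"), ("Car", "fg2")]  # hypothetical (package_name, fg_name) pairs

for package_name, fg_name in JOBS:
    run_name = f"{package_name.lower()}_{fg_name}"
    subprocess.run(
        [
            "python", "main.py",
            "--logdir", f"experiments/{run_name}",
            f"--name={run_name}",
            "--num_workers", "4",
            "--devices", "1",
            "--batch_size", "1",
            "--num_nodes", "1",
            "--base", "configs/murecom.yaml",
            f"--package_name={package_name}",
            f"--fg_name={fg_name}",
        ],
        check=True,  # stop the loop if one run fails
    )
```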
### 4. Inference on Examples
- Please refer to the `examples` folder for data preparation:
  - Use the same filename for paired data (e.g., `background` and `bbox`, `foreground` and `foreground_mask`); a sanity-check sketch is provided at the end of this section.
- To perform image composition using the pretrained model or your finetuned model, you can use `scripts/inference.py`. For example,

  ```bash
  python scripts/inference.py \
  --outdir results \
  --testdir examples/example1 \
  --num_samples 5 \
  --sample_steps 50 \
  --gpu 0
  ```

  or simply run:

  ```bash
  sh test.sh
  ```

The images in the `examples` folder are collected from the MureCom dataset. You can place your own images in the `examples` folder. If you want to use your own checkpoints, you need to set `ckpt_dir`.
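Before running inference on your own data, it can help to check that each background and its bbox file actually pair up. The sketch below assumes each bbox file stores one line of `x1 y1 x2 y2` pixel coordinates; this format is an assumption, so verify it against the files shipped in `examples` first. The file names are placeholders.

```python
# Sketch: overlay a bounding box on its paired background image to verify
# that filenames and coordinates line up before running inference.
# ASSUMPTION: each bbox file holds one line "x1 y1 x2 y2" in pixels;
# confirm against the files in examples/ before relying on this.
from PIL import Image, ImageDraw

def preview(background_path: str, bbox_path: str, out_path: str) -> None:
    image = Image.open(background_path).convert("RGB")
    with open(bbox_path) as f:
        x1, y1, x2, y2 = (int(float(v)) for v in f.read().split()[:4])
    ImageDraw.Draw(image).rectangle([x1, y1, x2, y2], outline=(255, 0, 0), width=3)
    image.save(out_path)

preview("examples/example1/background/sample.jpg",  # placeholder file names
        "examples/example1/bbox/sample.txt",
        "bbox_preview.jpg")
```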
### 5. Visualization Results
We showcase several example results generated by the pretrained model and the finetuned model on the MureCom dataset. In each example, from left to right, we show the background image with the bounding box specifying where to insert the foreground object, the reference images of the foreground object, and 5 results using different random seeds. The results in odd rows are obtained with the pretrained model, and the results in even rows with the finetuned model.
<p align='center'> <img src='./figs/result.jpg' width=90% /> </p>

We also provide more results of our MureObjectStitch on the MureCom dataset through [Baidu Cloud] (code: 7jxd). In each image in the folder, from top to bottom, we show the results of the model finetuned for 50, 100, 150, and 200 epochs. Finetuning for 150 epochs generally achieves satisfactory results. In some cases, finetuning for more epochs (e.g., 200) helps preserve more details, yet at the risk of distorted content and improper illumination. Finetuning for 150 epochs takes about 15 minutes on a single A6000 GPU.
In the figure below, we show some example results of our MureObjectStitch. In each example, from left to right, we show the background image with the specified foreground placement, one example reference image of the foreground object, and 5 results using different random seeds.
<p align='center'> <img src='./figs/more_results1.jpg' width=90% /> </p>
<p align='center'> <img src='./figs/more_results2.jpg' width=90% /> </p>

## Citation
If you find this work or code helpful in your research, please cite:
```bibtex
@article{mureobjectstitch,
  title={MureObjectStitch: Multi-reference Image Composition},
  author={Chen, Jiaxuan and Zhang, Bo and Niu, Li},
  journal={arXiv preprint arXiv:2411.07462},
  year={2024}
}
```
## Other Resources
- We summarize the papers and codes of generative image composition: Awesome-Generative-Image-Composition
- We summarize the papers and codes of image composition from all aspects: Awesome-Image-Composition
- We summarize all possible evaluation metrics to evaluate the quality of composite images: Composite-Image-Evaluation
- We write a comprehensive survey on image composition: the latest version