
<div align=center> <img src="assets/icon.png" width=60%></div> <div align="center"> <h3>LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion</h3>

Pancheng Zhao<sup>1,2</sup> · Peng Xu<sup>3+</sup> · Pengda Qin<sup>4</sup> · Deng-Ping Fan<sup>2,1</sup> · Zhicheng Zhang<sup>1,2</sup> · Guoli Jia<sup>1</sup> · Bowen Zhou<sup>3</sup> · Jufeng Yang<sup>1,2</sup>

<sup>1</sup> VCIP & TMCC & DISSec, College of Computer Science, Nankai University

<sup>2</sup> Nankai International Advanced Research Institute (SHENZHEN· FUTIAN)

<sup>3</sup> Department of Electronic Engineering, Tsinghua University · <sup>4</sup>Alibaba Group

<sup>+</sup>corresponding authors

CVPR 2024

<a href="http://arxiv.org/abs/2404.00292"><img src='https://img.shields.io/badge/arXiv-LAKE RED-red' alt='Paper PDF'></a> <a href=''><img src='https://img.shields.io/badge/Official Version-LAKE RED-blue' alt='Project Page'></a> <a href='https://zhaopancheng.top/publication/LAKERED_CVPR24'><img src='https://img.shields.io/badge/Project_Page-LAKE RED-green' alt='Project Page'></a>

</div> <div align=center> <img src="assets/GIF_b_small.gif" width=250/><img src="assets/GIF_c_small.gif" width=250/> </div>

1. News

2. Get Started

1. Requirements

If you already have the ldm environment, you can skip this step.

A suitable conda environment named ldm can be created and activated with:

conda env create -f ldm/environment.yaml
conda activate ldm

2. Download Datasets and Checkpoints.

Datasets:

We collected and organized the LAKERED dataset from existing datasets. The training set is drawn from COD10K and CAMO, and the test set includes three subsets: Camouflaged Objects (CO), Salient Objects (SO), and General Objects (GO).

| Datasets | GoogleDrive | BaiduNetdisk (v245) |

Results:

The results of this paper can be downloaded at the following link:

| Results | GoogleDrive | BaiduNetdisk (berx) |

Checkpoint:

The Pre-trained Latent-Diffusion-Inpainting Model

| Pretrained Autoencoding Models | Link |
| Pretrained LDM | Link |

Put them at the following paths:

Pretrained Autoencoding Models: ldm/models/first_stage_models/vq-f4-noattn/model.ckpt
Pretrained LDM: ldm/models/ldm/inpainting_big/last.ckpt

The Pre-trained LAKERED Model

| LAKERED | GoogleDrive | BaiduNetdisk (dzi8) |

Put it at the following path:

LAKERED: ckpt/LAKERED.ckpt
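
Before running the demo or training, it can help to confirm that all three checkpoints sit at the paths listed above. The following is a minimal sketch, assuming it is run from the repository root; the helper name `missing_checkpoints` is hypothetical and not part of the LAKE-RED code.

```python
from pathlib import Path

# Paths copied from the instructions above, relative to the repository root.
EXPECTED_CHECKPOINTS = {
    "Pretrained Autoencoding Models": "ldm/models/first_stage_models/vq-f4-noattn/model.ckpt",
    "Pretrained LDM": "ldm/models/ldm/inpainting_big/last.ckpt",
    "LAKERED": "ckpt/LAKERED.ckpt",
}


def missing_checkpoints(repo_root="."):
    """Return 'name: path' strings for every expected checkpoint that is absent."""
    root = Path(repo_root)
    return [
        f"{name}: {rel_path}"
        for name, rel_path in EXPECTED_CHECKPOINTS.items()
        if not (root / rel_path).is_file()
    ]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:")
        for entry in missing:
            print("  " + entry)
    else:
        print("All checkpoints found.")
```

An empty list means every file is in place; otherwise each entry names the checkpoint that still needs to be downloaded or moved.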

3. Quick Demo:

You can quickly experience the model with the following commands:

sh demo.sh

4. Train

4.1 Combine the codebook with the pretrained LDM

python combine.py

4.2 Start Training

You can edit the `config_LAKERED.yaml` file to modify the settings.

sh train.sh

Note: fixing the KeyError 'global_step'

Quick fix: pass --resume with the checkpoint that was saved when training terminated with the error (logs/checkpoints/last.ckpt).

You can also skip 4.1 and download LAKERED_init.ckpt to start training.
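
To check in advance whether a checkpoint is likely to trigger this error, you can inspect its top-level keys. The sketch below assumes the KeyError is raised because the checkpoint dictionary being resumed has no `global_step` entry; that cause is an assumption, and `missing_resume_keys` and `load_checkpoint` are hypothetical helpers, not part of this repository.

```python
def missing_resume_keys(checkpoint, required=("global_step",)):
    """Return the required top-level keys that the checkpoint dict lacks."""
    return [key for key in required if key not in checkpoint]


def load_checkpoint(path):
    """Load a checkpoint onto the CPU.

    torch is imported lazily so that the key check above has no dependencies.
    """
    import torch

    return torch.load(path, map_location="cpu")


# Example (path taken from the note above):
#   ckpt = load_checkpoint("logs/checkpoints/last.ckpt")
#   print(missing_resume_keys(ckpt))
```

If the returned list is empty, the checkpoint carries the entry that the error complains about.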

5. Test

Generate camouflage images with foreground objects in the test set:

sh test.sh

Note that this takes a long time; alternatively, you can download the results linked above.

6. Eval

Use torch-fidelity to calculate FID and KID:

pip install torch-fidelity

Specify the result root and the data root, then run the evaluation:

sh eval.sh
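
For reference, torch-fidelity also exposes a Python API, `torch_fidelity.calculate_metrics`. The sketch below shows how FID and KID could be computed directly from two image folders. It is not a description of what `eval.sh` does, and the function names `build_metric_args` and `evaluate` as well as the example folder names are hypothetical.

```python
def build_metric_args(result_root, data_root, use_cuda=True):
    """Assemble keyword arguments for torch_fidelity.calculate_metrics."""
    return {
        "input1": str(result_root),  # folder of generated images
        "input2": str(data_root),    # folder of reference images
        "cuda": use_cuda,
        "fid": True,  # Frechet Inception Distance
        "kid": True,  # Kernel Inception Distance
    }


def evaluate(result_root, data_root, use_cuda=True):
    """Compute FID and KID and return torch-fidelity's metrics dictionary."""
    # Imported lazily; install with `pip install torch-fidelity`.
    import torch_fidelity

    return torch_fidelity.calculate_metrics(
        **build_metric_args(result_root, data_root, use_cuda)
    )


# Example with placeholder folder names:
#   metrics = evaluate("path/to/results", "path/to/reference_images")
#   print(metrics["frechet_inception_distance"])
#   print(metrics["kernel_inception_distance_mean"])
```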

If you hit “RuntimeError: stack expects each tensor to be equal size”, the cause is inconsistent image sizes.

Debug by following these steps:

(1) Locate datasets.py in the torch-fidelity package:

anaconda3/envs/envs-name/lib/python3.8/site-packages/torch_fidelity/datasets.py

(2) Import torchvision.transforms:

import torchvision.transforms as TF

(3) Revise line 24:

self.transforms = TF.Compose([TF.Resize((299,299)),TransformPILtoRGBTensor()]) if transforms is None else transforms

Alternatively, resize the images manually so that they all have the same size.
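
To find out which images differ in size before resizing them, a small script can group files by resolution. This is a minimal sketch, assuming Pillow is installed; `read_sizes` and `group_by_size` are hypothetical helper names.

```python
from collections import defaultdict
from pathlib import Path


def group_by_size(sizes):
    """Map each (width, height) to the list of file names that have that size."""
    groups = defaultdict(list)
    for name, size in sizes.items():
        groups[size].append(name)
    return dict(groups)


def read_sizes(root, exts=(".png", ".jpg", ".jpeg")):
    """Collect the (width, height) of every image found under root."""
    from PIL import Image  # Pillow, imported lazily

    sizes = {}
    for path in sorted(Path(root).rglob("*")):
        if path.suffix.lower() in exts:
            with Image.open(path) as img:
                sizes[str(path)] = img.size
    return sizes


# Example with a placeholder folder name:
#   for size, names in group_by_size(read_sizes("path/to/results")).items():
#       print(size, len(names))
```

If more than one size is reported, the folder mixes resolutions, which matches the cause of the error described above.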

Contact

If you have any questions, please feel free to contact me:

zhaopancheng@mail.nankai.edu.cn

pc.zhao99@gmail.com

Citation

If you find this project useful, please consider citing:

@inproceedings{zhao2024camouflaged,
      author = {Zhao, Pancheng and Xu, Peng and Qin, Pengda and Fan, Deng-Ping and Zhang, Zhicheng and Jia, Guoli and Zhou, Bowen and Yang, Jufeng},
      title = {LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      year = {2024},
}

Acknowledgements

This code borrows heavily from latent-diffusion-inpainting. Thanks to nickyisadog for the contribution.