🏞️ Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition

<a href="https://danielchyeh.github.io/Gen4Gen/"><img src="https://img.shields.io/static/v1?label=Project&message=Website&color=blue" height=20.5></a> <a href="https://arxiv.org/abs/2402.15504"><img src="https://img.shields.io/static/v1?label=Paper&message=Link&color=green" height=20.5></a> <a href=""><img src="https://img.shields.io/static/v1?label=Project&message=Video&color=red" height=20.5></a>

By Chun-Hsiao Yeh*, Ta-Ying Cheng*, He-Yen Hsieh*, Chuan-En Lin, Yi Ma, Andrew Markham, Niki Trigoni, H.T. Kung, Yubei Chen ( * equal contribution)

UC Berkeley, University of Oxford, Harvard University, CMU, HKU, UC Davis

tags: stable diffusion, personalized text-to-image generation, llm

This repo is the official implementation of "Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition".

<br> <div class="gif"> <p align="center"> <img src='assets/CVPR24-Gen4Gen-animation-HowItWorks.gif' align="center" width=800> </p> </div>

TL;DR: We introduce Gen4Gen, a dataset creation pipeline that composes personal concepts into realistic scenes with complex compositions, accompanied by detailed text descriptions.

📝 Updates

🔎 Overview of This Repository

🎁 Prepare Personal Assets

Please prepare your personal images and put them under data/s0_source_images. Our personal images are from Unsplash. The structure of <a href="#2">🗂 data/s0_source_images</a> looks like this:

<details> <summary><a name="2"></a>🗂 Structure of data/s0_source_images </summary>
../data/s0_source_images
└── cat_dog_houseplant_3objs
    ├── cat
    │   └── sergey-semin-agQhOHQipoE-unsplash.jpg
    ├── dog
    │   ├── Copy of 5.jpeg
    │   └── Copy of 6.jpeg
    └── houseplant
        ├── Copy of 1.png
        ├── Copy of 2.png
        ├── Copy of 3.png
        └── Copy of 5.png
└── [folder_of_other_scenes]
    ├── [object_name_1]
    │   ├── [image_name_1.jpg]
    │   ├── ...
    │   └── [image_name_n.jpeg]
    ...
    └── [object_name_n]
</details>
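
Before running the pipeline, it can help to check that your assets follow the scene/object/image layout shown above. The snippet below is a minimal sketch of such a check and is not part of the official codebase: the function names `index_source_images` and `find_empty_objects` and the `IMAGE_EXTENSIONS` set are hypothetical, and the accepted extensions are an assumption based only on the file types in the example tree.

```python
from pathlib import Path

# Assumption: only the extensions that appear in the example tree are accepted.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def index_source_images(root):
    """Map each scene folder to its object folders and their image file names."""
    root = Path(root)
    index = {}
    for scene_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        objects = {}
        for object_dir in sorted(p for p in scene_dir.iterdir() if p.is_dir()):
            objects[object_dir.name] = sorted(
                f.name
                for f in object_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
        index[scene_dir.name] = objects
    return index


def find_empty_objects(index):
    """Return (scene, object) pairs whose folder holds no recognized image."""
    return [
        (scene, obj)
        for scene, objects in index.items()
        for obj, images in objects.items()
        if not images
    ]


if __name__ == "__main__":
    source_root = Path("data/s0_source_images")
    if source_root.is_dir():
        index = index_source_images(source_root)
        for scene, objects in index.items():
            print(scene)
            for obj, images in objects.items():
                print(f"  {obj}: {len(images)} image(s)")
        for scene, obj in find_empty_objects(index):
            print(f"warning: {scene}/{obj} has no images")
    else:
        print(f"{source_root} not found; run this from the repository root")
```

Run from the repository root, the script prints one line per scene followed by the image count for each object, and warns about object folders that contain no images.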

:world_map: <a name="3"></a> Environments

<details> <summary>Library versions</summary> </details>

:thumbsup: <a name="10"></a> Acknowledgement

Our codebase builds on DIS, LLM-grounded Diffusion, SD-XL Inpainting, and custom-diffusion. We thank the authors for their nicely organized code and fantastic work!

📬 How to Get Support?

If you have any general questions or need support, please feel free to contact: Chun-Hsiao Yeh, Ta-Ying Cheng and He-Yen Hsieh. Also, we encourage you to open an issue in the GitHub repository. By doing so, you not only receive support but also contribute to the collective knowledge base for others who may have similar inquiries.

:heart: <a name="11"></a> Citation

If you find the codebase and the MyCanvas dataset valuable and use them in your work, please consider giving our GitHub repository a ⭐ and citing our paper.

@misc{yeh2024gen4gen,
  author        = {Chun-Hsiao Yeh and
                   Ta-Ying Cheng and
                   He-Yen Hsieh and
                   David Chuan-En Lin and
                   Yi Ma and
                   Andrew Markham and
                   Niki Trigoni and
                   H.T. Kung and
                   Yubei Chen},
  title         = {Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition},
  year          = {2024},
  eprint        = {2402.15504},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}