Idea2Img <img src="./icon.png" width="5%"/>
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation
by Zhengyuan Yang, Jianfeng Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zicheng Liu, and Lijuan Wang
Introduction
Built upon GPT-4V(ision), Idea2Img is a multimodal iterative self-refinement system that enhances any T2I model for automatic image design and generation, enabling various new image creation functionalities together with better visual quality.
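
As a rough illustration of what an iterative self-refinement loop around a T2I model can look like, here is a minimal sketch. It is not the code in this repository: the function `refine` and the four callables (`draft_prompts`, `render`, `pick_best`, `critique`) are hypothetical placeholders for a multimodal model and a T2I backend, and the stopping rule (empty feedback) is an assumption made for the example.

```python
from typing import Callable, List, Tuple


def refine(
    idea: str,
    draft_prompts: Callable[[str, List[Tuple[str, str]]], List[str]],
    render: Callable[[str], str],
    pick_best: Callable[[str, List[str]], int],
    critique: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Return the last selected image handle after up to max_rounds rounds.

    All callables are hypothetical stand-ins:
      draft_prompts(idea, history) -> candidate T2I prompts
      render(prompt)               -> handle (e.g. a path) of a draft image
      pick_best(idea, images)      -> index of the preferred draft
      critique(idea, image)        -> textual feedback, "" if satisfied
    """
    history: List[Tuple[str, str]] = []  # (prompt, feedback) from earlier rounds
    best_image = ""
    for _ in range(max_rounds):
        prompts = draft_prompts(idea, history)      # write or revise prompts
        images = [render(p) for p in prompts]       # one draft image per prompt
        best = pick_best(idea, images)              # choose the most promising draft
        best_image = images[best]
        feedback = critique(idea, best_image)       # reflect on what is still off
        if not feedback:                            # assumed stop rule: nothing to fix
            break
        history.append((prompts[best], feedback))   # remember for the next revision
    return best_image
```

Passing plain Python functions for the four callables is enough to exercise the loop without any API key, which makes the control flow easy to inspect before wiring in real models.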
<p align="center"> <img src="https://idea2img.github.io/images/teaser.png" width="75%"/> </p>

Prerequisites
- Obtain a public OpenAI GPT-4V API key and set up T2I inference accordingly, e.g., with SDXL.
Installation
- Clone the repository:

      git clone https://github.com/zyang-ur/idea2img.git
Running
- Inference prompts will be read from the file passed to `--testfile`. `<IMG>` is a separator token inserted between two images and between an image and text.

      mkdir output
      python idea2img_pipeline.py --api_key OAI_GPT4V_Key --testfile testsample.txt --fewshot --select_fewshot
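
To make the role of the `<IMG>` separator concrete, the sketch below splits one prompt line into its interleaved segments. How `idea2img_pipeline.py` actually parses the test file is not described here, so the helper `split_prompt`, the one-prompt-per-line layout, and the file names in the example are all assumptions for illustration.

```python
SEPARATOR = "<IMG>"  # separator token named in this README


def split_prompt(line: str) -> list[str]:
    """Split one prompt line into its image and text segments.

    Hypothetical helper: whitespace around each segment is stripped and
    empty pieces (e.g. from a leading or trailing separator) are dropped.
    """
    return [part.strip() for part in line.split(SEPARATOR) if part.strip()]


# Hypothetical prompt line; the image file names are placeholders.
example = "dog.png <IMG> cat.png <IMG> a logo combining both animals"
print(split_prompt(example))
```

A line without any separator comes back as a single text segment, so plain text-only prompts pass through unchanged.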
Results
- Generated results and intermediate steps will be saved to the `output` folder.
Citation
@article{yang2023idea2img,
title={Idea2img: Iterative self-refinement with gpt-4v (ision) for automatic image design and generation},
author={Yang, Zhengyuan and Wang, Jianfeng and Li, Linjie and Lin, Kevin and Lin, Chung-Ching and Liu, Zicheng and Wang, Lijuan},
journal={arXiv preprint arXiv:2310.08541},
year={2023}
}