Idea2Img <img src="./icon.png" width="5%"/>

Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation

by Zhengyuan Yang, Jianfeng Wang, Linjie Li, Kevin Lin, Chung-Ching Lin, Zicheng Liu, and Lijuan Wang

Introduction

Built upon GPT-4V(ision), Idea2Img is a multimodal iterative self-refinement system that enhances any text-to-image (T2I) model for automatic image design and generation, enabling various new image creation functionalities together with better visual quality.

<p align="center"> <img src="https://idea2img.github.io/images/teaser.png" width="75%"/> </p>
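
As a rough illustration of what an iterative self-refinement loop looks like in general, here is a minimal sketch. It is not the code in this repository: `refine_loop`, `propose`, `generate`, `critique`, `max_rounds`, and `good_enough` are hypothetical names, and the toy functions stand in for a multimodal model and a T2I model so that the example runs without any API key.

```python
from typing import Callable, List, Tuple


def refine_loop(
    idea: str,
    propose: Callable[[str, List[str]], str],
    generate: Callable[[str], str],
    critique: Callable[[str, str], Tuple[float, str]],
    max_rounds: int = 3,
    good_enough: float = 0.9,
) -> Tuple[str, str, float]:
    """Generic loop: propose a prompt, generate a draft, critique it, repeat."""
    feedback_history: List[str] = []
    best: Tuple[str, str, float] = ("", "", float("-inf"))
    for _ in range(max_rounds):
        prompt = propose(idea, feedback_history)   # revise the prompt using past feedback
        image = generate(prompt)                   # any T2I model could sit here
        score, feedback = critique(idea, image)    # judge the draft against the idea
        if score > best[2]:
            best = (prompt, image, score)
        if score >= good_enough:
            break
        feedback_history.append(feedback)
    return best


# Toy stand-ins (assumptions for illustration only).
def toy_propose(idea: str, history: List[str]) -> str:
    return idea + "".join(f" [{note}]" for note in history)


def toy_generate(prompt: str) -> str:
    return f"image<{prompt}>"


def toy_critique(idea: str, image: str) -> Tuple[float, str]:
    rounds_of_feedback = image.count("[")
    return min(1.0, 0.5 + 0.25 * rounds_of_feedback), "add detail"


best_prompt, best_image, best_score = refine_loop(
    "a red bicycle", toy_propose, toy_generate, toy_critique
)
```

Passing the three steps in as callables keeps the loop independent of any particular model, which mirrors the claim that the approach can wrap any T2I model.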

Prerequisites

Installation

  1. Clone the repository

    git clone https://github.com/zyang-ur/idea2img.git
    

Running

  1. Inference prompts are read from the file passed to --testfile. <IMG> is a separator token inserted between two images and between an image and text.

    mkdir output
    python idea2img_pipeline.py --api_key OAI_GPT4V_Key --testfile testsample.txt --fewshot --select_fewshot
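
To make the role of the <IMG> separator concrete, here is a small sketch of how one prompt line could be split into its image and text parts. The exact layout of testsample.txt is not described here, so the example line, the helper names `split_prompt` and `looks_like_image`, and the rule that image segments are file paths with an image extension are all assumptions for illustration, not the pipeline's actual parser.

```python
from typing import List, Tuple

SEPARATOR = "<IMG>"  # separator token named in this README


def split_prompt(line: str) -> List[str]:
    """Split one prompt line on the separator, dropping empty segments."""
    return [part.strip() for part in line.split(SEPARATOR) if part.strip()]


def looks_like_image(segment: str) -> bool:
    # Assumption: image segments are paths ending in a common image extension.
    return segment.lower().endswith((".png", ".jpg", ".jpeg", ".webp"))


# Hypothetical prompt line: two images followed by a text instruction.
example = "dog.png <IMG> beach.jpg <IMG> a photo of the dog on this beach"
segments = split_prompt(example)
labeled: List[Tuple[str, str]] = [
    ("image" if looks_like_image(s) else "text", s) for s in segments
]
```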
    

Results

  1. Generated results and intermediate steps are saved to the output folder.
<p align="center"> <img src="./main_de3.png" width="75%"/> </p>

Citation

@article{yang2023idea2img,
  title={Idea2img: Iterative self-refinement with gpt-4v (ision) for automatic image design and generation},
  author={Yang, Zhengyuan and Wang, Jianfeng and Li, Linjie and Lin, Kevin and Lin, Chung-Ching and Liu, Zicheng and Wang, Lijuan},
  journal={arXiv preprint arXiv:2310.08541},
  year={2023}
}