On Enriching Image Captions by Fine-Tuning Large Vision-Language Models with Caption Rewrites
Illustration of image caption rewriting using ChatGPT.
In Stage 1, the Keyword Extraction Prompt instructs ChatGPT to generate verbs, nouns, and adjectives (highlighted in brown) from the original caption. In Stage 2, the Caption Generation Prompt guides ChatGPT to generate a rewritten caption. By iteratively applying this prompt, multiple rewritten captions can be generated.
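The two stages described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `gen_augdata.py`: the prompt wording, the function names (`extract_keywords`, `rewrite_caption`), and the `fake_llm` stand-in are all assumptions made for the example. In practice `llm` would be replaced by a function that sends the prompt to ChatGPT and returns its reply.

```python
from typing import Callable, List

# Hypothetical prompt wording; the prompts actually used by the authors may differ.
KEYWORD_PROMPT = (
    "Extract the verbs, nouns, and adjectives from this caption as a "
    "comma-separated list.\nCaption: {caption}"
)
CAPTION_PROMPT = "Write one new image caption that uses these keywords: {keywords}"


def extract_keywords(caption: str, llm: Callable[[str], str]) -> List[str]:
    """Stage 1: ask the model for keywords and parse the comma-separated reply."""
    reply = llm(KEYWORD_PROMPT.format(caption=caption))
    return [word.strip() for word in reply.split(",") if word.strip()]


def rewrite_caption(
    caption: str, llm: Callable[[str], str], n_rewrites: int = 3
) -> List[str]:
    """Stage 2: apply the caption generation prompt repeatedly to get several rewrites."""
    keywords = extract_keywords(caption, llm)
    prompt = CAPTION_PROMPT.format(keywords=", ".join(keywords))
    return [llm(prompt).strip() for _ in range(n_rewrites)]


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a ChatGPT call (assumption, for demonstration only)."""
    if prompt.startswith("Extract"):
        return "dog, runs, green, field"
    return "A dog runs across a green field."


if __name__ == "__main__":
    for rewritten in rewrite_caption("A dog runs on a green field.", fake_llm, 2):
        print(rewritten)
```

Passing the model call in as a plain callable keeps the two-stage logic independent of any particular API client, so the same sketch works with a real chat endpoint or with a stub like the one above.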
python gen_augdata.py
Use a different model
First deploy the model you want to use locally.
python {use_llava/owl/minigpt4}.py