Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks

Official implementation of Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks.

TODO

Setup

Prepare dataset.

For each dataset, set the correct dataset path in dataset_eval.py and dataset_eval_gpt4.py.

Model Setup

To setup

Evaluation

To evaluate LLaVA, InstructBLIP, or MiniGPT-4, run:

    python dataset_eval.py --model [llava/blip/minigpt4] --method [Method] --dataset [Dataset]
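To run several configurations in one go, the command above can be generated programmatically. The following is a minimal sketch, not part of the repository: `build_eval_commands` is a hypothetical helper, and the method and dataset names passed to it are placeholders that must be replaced with values the script actually accepts. Only the three model names and the `--model`, `--method`, and `--dataset` flags come from this README. The script prints the commands instead of executing them.

```python
import itertools
import sys

# Model identifiers as listed in this README.
MODELS = ["llava", "blip", "minigpt4"]


def build_eval_commands(methods, datasets, models=MODELS, script="dataset_eval.py"):
    """Return one argument list per (model, method, dataset) combination.

    Hypothetical helper: method and dataset names are NOT validated here;
    they must match whatever dataset_eval.py accepts.
    """
    commands = []
    for model, method, dataset in itertools.product(models, methods, datasets):
        commands.append(
            [sys.executable, script,
             "--model", model,
             "--method", method,
             "--dataset", dataset]
        )
    return commands


if __name__ == "__main__":
    # Placeholder names; replace with real method and dataset values.
    for cmd in build_eval_commands(["METHOD_PLACEHOLDER"], ["DATASET_PLACEHOLDER"]):
        # Dry run: print each command rather than launching it.
        print(" ".join(cmd))
```

Each printed line can be pasted into a shell, or each argument list can be passed to `subprocess.run(cmd, check=True)` once the placeholders are filled in.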

To evaluate GPT-4, first set your API key in utils_models/utils_gpt4.py, then run:

    python dataset_eval_gpt4.py --method [Method] --dataset [Dataset]
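If you prefer not to hard-code the key in a tracked file, one option is to read it from an environment variable. The sketch below is an assumption, not how utils_models/utils_gpt4.py is known to work: `load_api_key` is a hypothetical helper, and `OPENAI_API_KEY` is simply the variable name conventionally used with the OpenAI API. How the returned value is wired into utils_gpt4.py depends on that file's contents.

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Hypothetical helper: raises a clear error instead of silently
    continuing with an empty key.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var} is not set; "
            "export it before running dataset_eval_gpt4.py."
        )
    return key


# Example usage (assumes the variable was exported in the shell first):
#   api_key = load_api_key()
```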

Citation

If you find this repository useful, please give it a star and cite it as follows:

    @article{qraitem2024vision,
    title={Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks},
    author={Qraitem, Maan and Tasnim, Nazia and Saenko, Kate and Plummer, Bryan A},
    journal={arXiv preprint arXiv:2402.00626},
    year={2024}
    }