LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions

Overview

This repository contains code to replicate key experiments from our paper LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions.

Installation

First, install the dependencies:

pip install -r requirements.txt
pip install git+https://github.com/openai/CLIP.git

Follow DATASETS.md to prepare the datasets. After that, run the following command to extract the image features for a dataset:

python feat_extractor.py --data_dir $DATA --split_dir $SPLIT --output_dir data/$DATASET_NAME

where $DATA is the path to the dataset, $SPLIT is the path to the split file, and $DATASET_NAME is the name of the dataset (e.g., imagenet, caltech, etc.). The extracted features are saved to data/$DATASET_NAME.
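For example, to extract features for EuroSAT (the raw-data and split-file paths below are illustrative placeholders; substitute the locations you used when preparing the dataset per DATASETS.md):

```shell
export DATA=data/eurosat_raw
export SPLIT=splits/eurosat.json
export DATASET_NAME=eurosat
python feat_extractor.py --data_dir $DATA --split_dir $SPLIT --output_dir data/$DATASET_NAME
```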

You need an OpenAI API key to use GPT-4. Set the environment variable in your terminal (note: no spaces around the `=`):

export OPENAI_API_KEY="YOUR_API_KEY"
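As a quick sanity check before launching a long run, you can verify the key is actually visible to Python. This helper is a hypothetical convenience, not part of the repository:

```python
import os

def require_api_key(var="OPENAI_API_KEY"):
    """Fail fast with a clear message if the API key is not exported."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running main.py")
    return key
```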

Usage

To generate the class descriptions for a dataset using our method, you can run the following command:

python main.py --img_dir data/eurosat --label_dir data/eurosat/labels.txt

Additional parameters can be adjusted in main.py.

To evaluate the generated descriptions, you can run the following command:

python eval.py --img_dir data/eurosat --label_dir result/Ours/eurosat_sota.txt
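Evaluation scores each image against the CLIP text embeddings of the generated class descriptions. A minimal sketch of that scoring scheme, assuming pre-computed L2-normalized features (the function name and the mean-pooling aggregation are illustrative assumptions, not the repo's exact code):

```python
import numpy as np

def classify(image_feats, class_desc_feats):
    """Predict a class index for each image.

    image_feats: (N, D) array of L2-normalized image features.
    class_desc_feats: list of (M_c, D) arrays, one per class, holding
    the embeddings of that class's generated descriptions.
    """
    # Average each class's description embeddings into one prototype.
    prototypes = np.stack([d.mean(axis=0) for d in class_desc_feats])
    prototypes /= np.linalg.norm(prototypes, axis=1, keepdims=True)
    # Cosine similarity against prototypes; highest-scoring class wins.
    return (image_feats @ prototypes.T).argmax(axis=1)
```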

To reproduce the results in the paper (including the baselines), run the commands in replicate_key_results.sh.

Citation

@article{han2023llms,
  title={LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions},
  author={Han, Songhao and Zhuo, Le and Liao, Yue and Liu, Si},
  journal={arXiv preprint arXiv:2311.11904},
  year={2023}
}