
HallE-Control: Controlling Object Hallucination in Large Multimodal Models

[Paper] [Project Page]

Bohan Zhai*, Shijia Yang*, Chenfeng Xu, Sheng Shen, Kurt Keutzer, Chunyuan Li, Manling Li

Install

  1. Clone this repository and navigate to the HallE_Control folder
git clone https://github.com/bronyayang/HallE_Control.git
cd HallE_Control
  2. Install the package
conda create -n halle python=3.10 -y
conda activate halle
bash scripts/run.sh

Training

  1. Prepare data

Follow the LLaVA instructions to prepare the data. Download the controller data here and put it in the ./data folder.

  2. Start training
bash scripts/v1_5/tune_controller.sh

Make sure the output_dir contains the word "controller" for correct inference behavior.

bash scripts/v1_5/finetune_indication.sh
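The output_dir note above can be checked before launching a long training run. The following is a minimal sketch, not the repository's own logic: how the inference code actually looks for the word "controller" is not shown here, so the helper name `has_controller_marker` and the choice to test only the final path component are assumptions made for illustration.

```python
from pathlib import PurePath


def has_controller_marker(output_dir: str) -> bool:
    """Return True if the last component of output_dir contains 'controller'.

    Assumption: the exact check used at inference time is unknown. Requiring
    the lowercase word in the final path component is a deliberately strict
    reading, so a directory that passes here should also pass a looser
    whole-path or case-insensitive check.
    """
    # PurePath ignores a trailing slash, so "out/controller/" yields "controller".
    name = PurePath(output_dir).name
    return "controller" in name


if __name__ == "__main__":
    # Hypothetical directory names, for illustration only.
    for candidate in ("./checkpoints/halle-controller-13b", "./checkpoints/halle-13b"):
        status = "ok" if has_controller_marker(candidate) else "missing 'controller'"
        print(f"{candidate}: {status}")
```

Checking the final component rather than the whole path avoids a false pass when only a parent folder happens to contain the word.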

Evaluation

The following procedure evaluates any model on CCEval.

  1. Download Visual Genome images part 1, part 2, and objects, then unzip them into the ./data folder

  2. Obtain captions from your model for 100 VG images. For example, to obtain captions from the controller model, run the following script:

bash scripts/v1_5/model_control_eval.sh
  3. Get CCEval results (without coverage) by running:
python3 cceval.py --cap_file [YOUR_CAPTION_FILE_PATH] --key [YOUR_OPENAI_API_KEY]
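Typing the API key directly into the command above leaves it in the shell history. One way around that, sketched below, is a small wrapper that reads the key from an environment variable and builds the same command. Only the `--cap_file` and `--key` flags come from this README; the helper names and the use of an `OPENAI_API_KEY` environment variable are assumptions for illustration.

```python
import os
import subprocess
import sys


def build_cceval_command(cap_file: str, api_key: str) -> list[str]:
    """Build the argument list for cceval.py using the flags shown in this README."""
    if not api_key:
        raise ValueError("OpenAI API key is empty")
    # sys.executable runs cceval.py with the interpreter of the active environment.
    return [sys.executable, "cceval.py", "--cap_file", cap_file, "--key", api_key]


def run_cceval(cap_file: str) -> None:
    """Run cceval.py from the repository root, taking the key from the environment."""
    # Assumption: the key was exported beforehand as OPENAI_API_KEY.
    api_key = os.environ.get("OPENAI_API_KEY", "")
    # check=True raises CalledProcessError if cceval.py exits with an error.
    subprocess.run(build_cceval_command(cap_file, api_key), check=True)
```

Calling `run_cceval("captions.json")` from the repository root, where `captions.json` is a placeholder for your caption file, is then equivalent to the command above. The key is still passed to cceval.py as an argument, so this keeps it out of the shell history but not out of the process list.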

Citation

If you find HallE-Control useful for your research and applications, please cite using this BibTeX:


@misc{zhai2023halleswitch,
      title={HallE-Switch: Controlling Object Hallucination in Large Vision Language Models}, 
      author={Bohan Zhai and Shijia Yang and Chenfeng Xu and Sheng Shen and Kurt Keutzer and Manling Li},
      year={2023},
      eprint={2310.01779},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}