Predicting Human Scanpaths in Visual Question Answering

This code implements the prediction of human scanpaths in three different tasks.

Reference

If you find the code useful in your research, please consider citing the paper.

@InProceedings{xianyu:2021:scanpath,
    author={Xianyu Chen and Ming Jiang and Qi Zhao},
    title = {Predicting Human Scanpaths in Visual Question Answering},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year = {2021}
}

Disclaimer

For the ScanMatch evaluation metric, we adapt part of the GazeParser package. For two of our other evaluation metrics, SED and STDE, we adopt the implementation from VAME, as described in Visual Attention Models. Our checkpoint code is based on the implementation in updown-baseline, slightly modified to fit our pipeline.
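
For readers unfamiliar with SED, the following is a minimal sketch of a string-edit-distance comparison between two scanpaths. It is not the VAME implementation used in this repository: the grid quantization, the function names `fixations_to_string` and `string_edit_distance`, and the default 5x5 grid are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def fixations_to_string(
    fixations: Sequence[Tuple[float, float]],
    width: float,
    height: float,
    cols: int = 5,  # assumed grid size, chosen only for illustration
    rows: int = 5,
) -> List[int]:
    """Quantize (x, y) fixations into grid-cell indices (hypothetical helper)."""
    cells = []
    for x, y in fixations:
        # Clamp so that points on the right/bottom border fall in the last cell.
        col = min(int(x / width * cols), cols - 1)
        row = min(int(y / height * rows), rows - 1)
        cells.append(row * cols + col)
    return cells


def string_edit_distance(a: Sequence, b: Sequence) -> int:
    """Plain Levenshtein distance with unit insert/delete/substitute costs."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]
```

A lower distance means the predicted and human scanpaths visit similar regions in a similar order. The actual evaluation scripts may differ in how they quantize and weight fixations.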

Requirements

To recreate the environment in which we successfully ran our code, use:

$ conda env create -f sp_baseline.yml

Tasks

We provide the corresponding code for the three aforementioned tasks on three different datasets.

More details on each task are provided in its corresponding folder.