A Joint Sequence Fusion Model for Video Question Answering and Retrieval

This project hosts the TensorFlow implementation for our ECCV 2018 paper, A Joint Sequence Fusion Model for Video Question Answering and Retrieval.

Reference

If you use this code or dataset as part of any published research, please cite the following paper.

@inproceedings{yu2018joint,
  author    = {Youngjae Yu and Jongseok Kim and Gunhee Kim},
  title     = "{A Joint Sequence Fusion Model for Video Question Answering and Retrieval}",
  booktitle = {ECCV},
  year      = 2018
}

Setup

Install dependencies

pip install -r requirements.txt

Set up Python paths

git submodule update --init --recursive
add2virtualenv .

Prepare Data

Training

Modify configuration.py to suit your environment.

Run train.py.

python train.py --tag="tag"

Pretrained Model

You can download the models and features from the gDrive Link. Modify 'configuration.py' to load the checkpoints (self.load_from_ckpt = 'path/to/checkpoint/').
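
As a rough illustration of the checkpoint setting, the sketch below shows a hypothetical configuration class. Only the attribute name load_from_ckpt comes from this README. The class name ModelConfig and the helper checkpoint_available are assumptions made for illustration and may not match the actual configuration.py. It also assumes the checkpoint path is a directory.

```python
import os


class ModelConfig(object):
    """Hypothetical stand-in for the class defined in configuration.py."""

    def __init__(self, load_from_ckpt=None):
        # Attribute name taken from the README; None means no checkpoint is set.
        self.load_from_ckpt = load_from_ckpt


def checkpoint_available(config):
    """Return True if a checkpoint directory is configured and exists on disk."""
    path = config.load_from_ckpt
    return bool(path) and os.path.isdir(path)
```

Checking the path before training starts can surface a mistyped checkpoint location early, rather than after the model graph has been built.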

[RET] R@1: 93, R@5: 247, R@10: 348, medr : 29
[FIB] Accuracy: 45.1

Your results may be slightly lower or higher than these scores.
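
For readers unfamiliar with the retrieval metrics, the sketch below computes R@K and median rank from a list of 1-based ranks of the correct item. It is a generic illustration, not the evaluation code of this repository. The function name retrieval_metrics and the example ranks are made up, and reporting R@K as a count of queries is an assumption about how the [RET] numbers are expressed.

```python
import statistics


def retrieval_metrics(ranks, ks=(1, 5, 10)):
    """Compute R@K hit counts and the median rank.

    ranks: 1-based rank of the correct item for each query (illustrative input).
    """
    # R@K: number of queries whose correct item is ranked within the top K.
    hits = {k: sum(1 for r in ranks if r <= k) for k in ks}
    # Median rank: lower is better.
    return hits, statistics.median(ranks)


# Made-up ranks for five queries, purely for demonstration.
hits, medr = retrieval_metrics([1, 3, 7, 12, 40])
print(hits, medr)
```

Dividing each hit count by the number of queries would give R@K as a fraction instead of a count.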