# EaTR (ICCV 2023)
This repository provides the official PyTorch implementation of the ICCV 2023 paper:
<p align="center"> <img src="model_overview.png"/> </p>

**Knowing Where to Focus: Event-aware Transformer for Video Grounding** [arXiv]<br>
Jinhyun Jang, Jungin Park, Jin Kim, Hyeongjun Kwon, Kwanghoon Sohn<br>
Yonsei University
## Prerequisites
<b>0. Clone this repo.</b>
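For example (the repository URL below is a placeholder; substitute the actual URL of this repo):

```bash
# clone the repository and move into it (URL is a placeholder)
git clone https://github.com/<user>/EaTR.git
cd EaTR
```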
<b>1. Install dependencies.</b>
We trained and evaluated our models with Python 3.7 and PyTorch 1.12.1.
```bash
# create conda env
conda create --name eatr python=3.7
# activate env
conda activate eatr
# install pytorch
conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch
# install other python packages
pip install tqdm ipython easydict tensorboard tabulate scikit-learn pandas
```
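You can sanity-check the installation afterwards (a minimal check; the CUDA result depends on your GPU setup):

```bash
# verify the PyTorch version and CUDA availability
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```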
<b>2. Prepare datasets.</b>
Download and extract the features for each dataset under the `../data/${dataset}/features/` directory.<br> The files are organized in the following manner:
```
EaTR
├── data
│   ├── qvhighlights
│   │   ├── *features
│   │   ├── highlight_{train,val,test}_release.jsonl
│   │   └── subs_train.jsonl
│   ├── charades
│   │   ├── *features
│   │   └── charades_sta_{train,test}_tvr_format.jsonl
│   └── activitynet
│       ├── *features
│       └── activitynet_{train,val_1,val_2}.jsonl
├── models
├── utils
├── scripts
├── README.md
├── train.py
└── ···
```
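The annotation files are in JSON Lines format, one JSON object per line. To peek at the schema of, e.g., the QVHighlights training annotations (path relative to the repository root, per the tree above):

```bash
# pretty-print the first annotation record
head -n 1 data/qvhighlights/highlight_train_release.jsonl | python -m json.tool
```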
## Training
Training can be launched by running the following command:
```bash
bash eatr/scripts/train.sh
```
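Since `tensorboard` is among the dependencies, training can presumably be monitored with it (the log directory is an assumption; point it at wherever `train.sh` writes its results):

```bash
# launch TensorBoard on the training logs; ${path-to-log-dir} is a placeholder
tensorboard --logdir ${path-to-log-dir}
```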
## Inference
Once the model is trained, you can use the following command for inference:
```bash
bash eatr/scripts/inference.sh ${path-to-checkpoint} ${split-name}
```
`${split-name}` can be one of `val` and `test`.
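For example, to evaluate on the validation split (the checkpoint path below is hypothetical):

```bash
# evaluate a trained checkpoint on the validation split
bash eatr/scripts/inference.sh results/model_best.ckpt val
```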
## Citation
```bibtex
@inproceedings{Jang2023Knowing,
  title={Knowing Where to Focus: Event-aware Transformer for Video Grounding},
  author={Jang, Jinhyun and Park, Jungin and Kim, Jin and Kwon, Hyeongjun and Sohn, Kwanghoon},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}
```