Awesome
SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding -- ECCV2024
This is the implementation for the paper "SHINE: Saliency-aware HIerarchical NEgative Ranking for Compositional Temporal Grounding" (ECCV 2024): ArXiv version.
by
Zixu Cheng*<sup>1</sup>, Yujiang Pu*<sup>2</sup>, Shaogang Gong<sup>1</sup>, Parisa Kordjamshidi<sup>2</sup>, Yu Kong<sup>2</sup>
<sup>1</sup>Queen Mary University of London, <sup>2</sup> Michigan State University (* Equal Contribution)
Prerequisites
<b>0. Clone this repo</b>
<b>1. Prepare datasets</b>
<b>Charades-CG</b> : Download I3D feature files for Charades-CG dataset from VSLNet.
<b>ActivityNet-CG</b> : Download C3D feature files for ActivityNet-CG dataset from MS-2D-TAN.
Text Features : We provide our hierarchical negative query features here. (To be uploaded)
<b>2. Install dependencies.</b>
conda create -n shine python=3.10
conda activate shine
cd SHINE
pip install -r requirements.txt
Train
Charades-CG
- Add your data and feature path in shine/scripts/train_charades_cg.sh
######## setup video+text features
feat_root= # path/to/your/anet/features
- Run the script
bash shine/scripts/train_charades_cg.sh
ActivityNet-CG
- Add your data and feature path in shine/scripts/train_anet_cg.sh
######## setup video+text features
feat_root= # path/to/your/anet/features
- Run the script
bash shine/scripts/train_anet_cg.sh --clip_length 1 --saliency_margin 1.0 --max_es_cnt 10 --max_q_l 50 --enc_layers 3 --dec_layers 3 --lr 0.00013 --use_saliency_loss
Evaluation
# Evaluate Charades-CG
bash shine/scripts/inference_charades.sh path/to/your/ckpt 'val'
# Evaluate ActivityNet-CG
bash shine/scripts/inference_anet.sh path/to/your/ckpt 'val'
We also provide our checkpoints here. (To be uploaded)
Contributors and Contact
If there are any questions, feel free to contact the authors: Zixu Cheng (zixu.cheng@qmul.ac.uk), and Yujiang Pu (puyujian@msu.edu).
Acknowledgment
Our implementations are based on Moment-DETR and QD-DETR. We thank the authors for their awesome open-source contributions.
LICENSE
The annotation files are transformed from VISA and many parts of the implementations are borrowed from Moment-DETR and QD-DETR. Following, Our codes are also under the MIT license.