HR-Pro: Point-supervised Temporal Action Localization via Hierarchical Reliability Propagation (AAAI24)

<p align="center"> <img src="assets/HR-Pro.png" > </p>

Huaxin Zhang, Xiang Wang, Xiaohao Xu, Zhiwu Qing, Changxin Gao, Nong Sang

Abstract: Point-supervised Temporal Action Localization (PSTAL) is an emerging research direction for label-efficient learning. However, current methods mainly focus on optimizing the network either at the snippet-level or the instance-level, neglecting the inherent reliability of point annotations at both levels. In this paper, we propose a Hierarchical Reliability Propagation (HR-Pro) framework, which consists of two reliability-aware stages: Snippet-level Discrimination Learning and Instance-level Completeness Learning. Both stages explore the efficient propagation of high-confidence cues in point annotations. For snippet-level learning, we introduce an online-updated memory to store reliable snippet prototypes for each class. We then employ a Reliability-aware Attention Block to capture both intra-video and inter-video dependencies of snippets, resulting in more discriminative and robust snippet representation. For instance-level learning, we propose a point-based proposal generation approach as a means of connecting snippets and instances, which produces high-confidence proposals for further optimization at the instance level. Through multi-level reliability-aware learning, we obtain more reliable confidence scores and more accurate temporal boundaries of predicted proposals. Our HR-Pro achieves state-of-the-art performance on multiple challenging benchmarks, including an impressive average mAP of 60.3% on THUMOS14. Notably, our HR-Pro largely surpasses all previous point-supervised methods, and even outperforms several competitive fully supervised methods.

🆕:Updates

📝:Results

The mean average precisions (mAPs) under the standard intersection over union (IoU) thresholds are reported. Please note that the results reported here differ slightly from those in the paper due to the influence of the random seed.

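| Dataset | @0.1 | @0.2 | @0.3 | @0.4 | @0.5 | @0.6 | @0.7 | AVG(0.1:0.5) | AVG(0.3:0.7) | AVG(0.1:0.7) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| THUMOS14 | 85.1 | 81.1 | 73.9 | 64.0 | 53.1 | 40.5 | 25.3 | 71.4 | 51.4 | 60.4 |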
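
As a quick sanity check on the AVG columns, the sketch below recomputes them as plain arithmetic means of the per-threshold mAPs listed above. The helper name `average_map` is ours for illustration and is not part of this repo.

```python
# Per-threshold mAP values (%) for THUMOS14, copied from the results above.
map_at_iou = {0.1: 85.1, 0.2: 81.1, 0.3: 73.9, 0.4: 64.0,
              0.5: 53.1, 0.6: 40.5, 0.7: 25.3}


def average_map(results, low, high):
    """Mean of the mAP values whose IoU threshold lies in [low, high]."""
    # A small tolerance guards against float rounding in the threshold keys.
    selected = [v for t, v in results.items() if low - 1e-9 <= t <= high + 1e-9]
    return sum(selected) / len(selected)


for low, high in [(0.1, 0.5), (0.3, 0.7), (0.1, 0.7)]:
    print(f"AVG({low}:{high}) = {average_map(map_at_iou, low, high):.1f}")
# AVG(0.1:0.5) = 71.4
# AVG(0.3:0.7) = 51.4
# AVG(0.1:0.7) = 60.4
```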

📖:Installation

Recommended Environment

You can install the required packages with pip install -r requirements.txt

Data Preparation

cd dataset/THUMOS14
tar -xzvf thumos_features.tar.gz

Please ensure the data structure is as below.

dataset
└── THUMOS14
    ├── gt_full.json
    ├── split_train.txt
    ├── split_test.txt
    ├── point_labels
    │   └── point_gaussian.csv
    └── features
        ├── train
        │   ├── video_validation_0000051.npy
        │   ├── video_validation_0000052.npy
        │   └── ...
        └── test
            ├── video_test_0000004.npy
            ├── video_test_0000006.npy
            └── ...
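
To confirm the layout before training, a small check like the one below may help. It is a minimal sketch that only tests for the files and folders named in the tree above. The function `check_thumos_layout` is a hypothetical helper, not something shipped with this repo, and it does not validate file contents.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to dataset/THUMOS14.
REQUIRED_FILES = [
    "gt_full.json",
    "split_train.txt",
    "split_test.txt",
    "point_labels/point_gaussian.csv",
]
REQUIRED_FEATURE_DIRS = ["features/train", "features/test"]


def check_thumos_layout(root):
    """Return a list of problems; an empty list means the layout looks right."""
    root = Path(root)
    problems = []
    for rel in REQUIRED_FILES:
        if not (root / rel).is_file():
            problems.append(f"missing file: {rel}")
    for rel in REQUIRED_FEATURE_DIRS:
        folder = root / rel
        if not folder.is_dir():
            problems.append(f"missing directory: {rel}")
        elif not any(folder.glob("*.npy")):
            # The folder exists but holds no extracted feature files.
            problems.append(f"no .npy features in: {rel}")
    return problems
```

Calling `check_thumos_layout("dataset/THUMOS14")` from the repo root should return an empty list once the features are extracted.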

🚗:Training and Testing

python main.py --cfg thumos --stage 1 --mode train
python main.py --cfg thumos --stage 1 --mode test
python main.py --cfg thumos --stage 2 --mode train
python main.py --cfg thumos --stage 2 --mode test
tensorboard --logdir=./ckpt
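
If you prefer to run the four main.py commands above in one go, a wrapper along these lines could work. This is only a sketch: `stage_commands` and `run_pipeline` are hypothetical names, the wrapper uses only the `--cfg`, `--stage`, and `--mode` flags shown above, and it assumes the commands should run in the listed order (stage 1 before stage 2).

```python
import subprocess
import sys


def stage_commands(cfg="thumos"):
    """Build the train/test commands for both stages, in the order listed above."""
    commands = []
    for stage in (1, 2):
        for mode in ("train", "test"):
            # sys.executable stands in for "python" so the current interpreter is used.
            commands.append([sys.executable, "main.py", "--cfg", cfg,
                             "--stage", str(stage), "--mode", mode])
    return commands


def run_pipeline(cfg="thumos", dry_run=False):
    """Run each command in turn and stop at the first failure."""
    for cmd in stage_commands(cfg):
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)
```

Call `run_pipeline(dry_run=True)` first to print the commands without executing them. The tensorboard command is left out because it is a monitoring tool that keeps running rather than a pipeline step.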

🛰️:References

We referenced the repos below for the code.

📑:Citation

If you find this repo useful for your research, please consider citing our paper:

@article{zhang2023hr,
  title={HR-Pro: Point-supervised Temporal Action Localization via Hierarchical Reliability Propagation},
  author={Zhang, Huaxin and Wang, Xiang and Xu, Xiaohao and Qing, Zhiwu and Gao, Changxin and Sang, Nong},
  journal={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024}
}