Hierarchical Atomic Action Network
This repo contains the code for the paper:
Li, Z., He, L., & Xu, H. (2022). Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions. In European Conference on Computer Vision (pp. 567-584). Springer, Cham.
Dependencies
The code is written and run with the following packages:
- Python 3.8.8
- PyTorch 1.7.1+cu110
- NumPy 1.20.1
- pandas 1.2.4
- scikit-learn 0.24.1
Data
- FineGym-new-split/ contains the new training/validation split we proposed for FineGym 99, with the same format as the original split on the FineGym website.
- dataset/ contains the annotations for each dataset. You can find the original annotations on the FineGym and FineAction websites. We did some preprocessing and obtained the following files from the original annotations:
  - fine_to_coarse_mappings.json: a mapping from each fine-level class to its corresponding coarse-level class.
  - video_names.npy: the names of the videos.
  - labels.npy: the action labels within each video.
  - segments.npy: the action start/end timestamps within each video.
  - sec2clip_ratios.npy: the ratios to convert time in seconds to the corresponding clip index. For FineGym, we extract features at 24 fps and each clip has 16 consecutive frames, so the ratio is 24/16 = 1.5 for all videos. For FineAction, we use the features with a fixed 100 clips provided by the FineAction competition page, so the ratios vary across videos.
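As an illustration of the seconds-to-clip conversion described for sec2clip_ratios.npy, the following minimal sketch scales segment timestamps by the FineGym ratio of 1.5. The function name seconds_to_clips and the timestamps are made up for this example, and the result is left fractional because how the repo rounds clip indices is not stated here.

```python
import numpy as np


def seconds_to_clips(segments_sec, ratio):
    """Scale [start, end] timestamps in seconds to fractional clip positions.

    Hypothetical helper for illustration; it is not part of the repo.
    """
    return np.asarray(segments_sec, dtype=float) * ratio


# FineGym: features extracted at 24 fps, 16 consecutive frames per clip.
FINEGYM_RATIO = 24 / 16  # 1.5 clips per second

# Made-up action segments (start, end) in seconds.
segments_sec = [[2.0, 6.0], [10.0, 12.0]]
segments_clip = seconds_to_clips(segments_sec, FINEGYM_RATIO)
print(segments_clip)
```

For FineAction the same multiplication would apply, using the per-video ratio stored in sec2clip_ratios.npy instead of a constant.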
Instructions
Data Preparation
Put the extracted I3D features under dataset/FineAction/ and/or dataset/FineGym/ and update features_path in config/fine_action.toml and/or config/fine_gym.toml accordingly.
Features can be downloaded via Google Drive or Baidu Netdisk. We extracted FineGym features using I3D, and FineAction features are from the FineAction competition page. We use the i3d_100 version of the features.
Training
Run the following command, replacing DATASET with FineAction or FineGym, EXP_NAME with your experiment name, and OUTPUT_DIR with the directory where you want to store the results:
python main.py --dataset DATASET --exp-name EXP_NAME --output-dir OUTPUT_DIR
After the run finishes, four models (encoder.pkl, fine_level_classifier.pkl, pseudo_label_classifier.pkl, coarse_level_classifier.pkl) and one result file (results.csv) will be saved under OUTPUT_DIR/DATASET/EXP_NAME.
Evaluation
Run the following command, replacing DATASET with FineAction or FineGym, and INPUT_MODELS_DIR with the directory where your models are stored:
python main.py --dataset DATASET --evaluation-only --input-models-dir INPUT_MODELS_DIR
Make sure that encoder.pkl and fine_level_classifier.pkl are under your INPUT_MODELS_DIR. The other two models, pseudo_label_classifier.pkl and coarse_level_classifier.pkl, are not needed for evaluation.
We also provide our pre-trained models under output/FineAction/pre-trained and output/FineGym/pre-trained.
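Before launching an evaluation run, it can help to confirm that the two required model files are in place. The sketch below is a hypothetical helper, not part of the repo; only the file names and the pre-trained model path come from the instructions above.

```python
from pathlib import Path

# File names taken from the evaluation instructions above.
REQUIRED_FOR_EVALUATION = ("encoder.pkl", "fine_level_classifier.pkl")


def missing_model_files(models_dir, required=REQUIRED_FOR_EVALUATION):
    """Return the required file names that are absent from models_dir."""
    models_dir = Path(models_dir)
    return [name for name in required if not (models_dir / name).is_file()]


if __name__ == "__main__":
    # Provided FineGym pre-trained models, relative to the repo root.
    missing = missing_model_files("output/FineGym/pre-trained")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files needed for evaluation are present.")
```

An empty list means the directory can be passed as INPUT_MODELS_DIR.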
References
We referenced the following repos for the code:
Citation
Please cite the following work if you use this package.
@inproceedings{li2022weakly,
title={Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions},
author={Li, Zhi and He, Lu and Xu, Huijuan},
booktitle={European Conference on Computer Vision},
pages={567--584},
year={2022},
organization={Springer}
}
Contact
If you have any questions, please contact the first author of the paper - Zhi Li (zhilicq@gmail.com).