FR-AGCN
Forward-reverse Adaptive Graph Convolutional Network for Skeleton-Based Action Recognition
Abstract
In this work, we propose the novel Forward-Reverse Adaptive Graph Convolutional Network (FR-AGCN) for skeleton-based action recognition. The sequences of joints and bones, as well as their reversed counterparts, are modeled simultaneously in a multi-stream network. By extracting deep features from both the forward and the reverse sequences and fusing the streams, this strategy significantly improves recognition accuracy.
This paper has been published in Neurocomputing. The work was carefully revised based on the professional comments of the editors and reviewers.
Article available online: https://doi.org/10.1016/j.neucom.2021.12.054
Date | State |
---|---|
Sep 03, 2021 | manuscript submitted to journal |
Oct 06, 2021 | revised and reconsidered |
Oct 25, 2021 | revision submitted to journal |
Dec 07, 2021 | accepted with minor revision |
Dec 09, 2021 | revision submitted to journal |
Dec 13, 2021 | accepted |
Environment
PyTorch version >=0.4
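A quick sanity check of the environment (a minimal snippet, assuming PyTorch is already installed via pip or conda):

```python
import torch

print(torch.__version__)          # expected to be >= 0.4
print(torch.cuda.is_available())  # True if a CUDA build and a GPU are available
```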
Notes
Experiments are run separately on three datasets: NTU RGB+D, NTU RGB+D 120, and UAV-Human.
Here are some important notes:
- Please perform data preprocessing before training.
The inter-frame interpolation strategy used for data augmentation has two parameters, fu and S. First, the frame count of every sample is unified to fu by interpolation. S is the data segmentation factor: the unified sequence is then down-sampled by keeping one frame out of every S frames along the temporal dimension (see the sketch after the table below).
You can change these parameters to increase or decrease the size of the input data, which has a noticeable impact on the final model performance.
If memory is insufficient, it is recommended to preprocess each benchmark separately and to allocate enough virtual memory.
We conducted detailed experiments on the CS benchmark of NTU RGB+D 60 (FJ, FB and FJB denote the forward-joint stream, the forward-bone stream and their fusion; see the naming note below). The results are as follows:
fu | S | FJ(%) | FB(%) | FJB(%) |
---|---|---|---|---|
600 | 1 | 86.85 | 86.81 | 88.88 |
600 | 2 | 87.44 | 87.68 | 89.29 |
600 | 3 | 86.02 | 87.06 | 88.77 |
600 | 4 | 85.93 | 86.69 | 88.34 |
300 | 1 | 86.42 | 86.79 | 88.86 |
300 | 2 | 85.73 | 85.98 | 88.23 |
Therefore, it is recommended to set fu=600 and S=2 for all three datasets. If GPU memory is insufficient, reduce the batch size as appropriate during training.
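To make the meaning of fu and S concrete, here is a minimal sketch of the frame-unification and down-sampling step. It is an illustration, not the exact code in data_gen/; the (N, C, T, V, M) array layout follows the 2s-AGCN data convention and is an assumption.

```python
import numpy as np

def unify_and_downsample(data, fu=600, S=2):
    """Unify every sample to fu frames by linear interpolation, then keep every S-th frame.

    data: array of shape (N, C, T, V, M) -- samples, channels, frames, joints, bodies.
    """
    N, C, T, V, M = data.shape
    # Positions of the fu target frames expressed in the original frame-index space.
    src = np.linspace(0, T - 1, num=fu)
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, T - 1)
    w = (src - lo).reshape(1, 1, fu, 1, 1)
    unified = (1 - w) * data[:, :, lo] + w * data[:, :, hi]  # (N, C, fu, V, M)
    # Down-sample the temporal dimension: keep one frame out of every S frames.
    return unified[:, :, ::S]  # (N, C, fu // S, V, M), e.g. 300 frames for fu=600, S=2
```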
- Before performing the multi-stream fusion, run the test phase with the parameters saved during training.
Select the checkpoint file according to the training results and set it in the corresponding test file under config (a hypothetical excerpt is shown below).
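For reference, a hypothetical excerpt of what the edited test config might look like, assuming the configs follow the 2s-AGCN layout with `phase` and `weights` fields (the checkpoint path below is a placeholder, not a file shipped with this repo):

```yaml
# excerpt of a test config such as config/nturgbd-cross-subject/test_forward.yaml
phase: test
weights: ./work_dir/ntu_cs_forward/your_best_epoch.pt  # checkpoint produced by your own training run
```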
- For ease of description, we name the single-stream and multi-stream networks according to their input data. For single-stream input, FJ-AGCN, RJ-AGCN, FB-AGCN and RB-AGCN indicate that the input to the AGCN is forward joint data, reverse joint data, forward bone data and reverse bone data, respectively. For multi-stream input, FR-AGCN denotes the network that integrates all four single streams. FJB-AGCN means that FJ-AGCN and FB-AGCN are fused at the end, FRJ-AGCN means that FJ-AGCN and RJ-AGCN are fused, and RJB-AGCN and FRB-AGCN follow by analogy.
Here, we compare the performance of each type of input data used separately and fuse their scores to obtain the final prediction. The results based on AGCN are shown below; CS/CV refer to NTU RGB+D, X-Sub/X-Set to NTU RGB+D 120, and CSv1/CSv2 to UAV-Human:
Methods | CS(%) | CV(%) | X-Sub(%) | X-Set(%) | CSv1(%) | CSv2(%) |
---|---|---|---|---|---|---|
FJ-AGCN | 87.44 | 94.08 | 81.23 | 81.57 | 40.08 | 65.66 |
RJ-AGCN | 87.78 | 94.00 | 81.23 | 82.14 | 39.23 | 63.40 |
FB-AGCN | 87.68 | 93.98 | 83.52 | 83.64 | 38.43 | 63.15 |
RB-AGCN | 88.03 | 93.66 | 83.44 | 83.66 | 38.86 | 63.75 |
FRJ-AGCN | 88.74 | 95.17 | 83.25 | 83.86 | 41.97 | 67.68 |
FRB-AGCN | 89.55 | 94.99 | 85.62 | 85.50 | 41.13 | 66.51 |
FJB-AGCN | 89.29 | 95.34 | 85.58 | 85.77 | 42.78 | 68.75 |
RJB-AGCN | 89.85 | 95.20 | 85.47 | 86.05 | 42.22 | 67.92 |
FR-AGCN | 90.46 | 95.83 | 86.60 | 86.99 | 43.98 | 69.50 |
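The multi-stream results above are obtained by score-level fusion: the per-class scores of the selected single streams are summed and the argmax of the sum gives the final prediction. A minimal sketch of the idea (the in-memory score arrays are an assumption; the repo's ensemble_*.py scripts read the saved score files instead):

```python
import numpy as np

def fuse_scores(score_list, weights=None):
    """Score-level fusion of several streams.

    score_list: list of arrays, each of shape (num_samples, num_classes),
                e.g. the scores of FJ-, RJ-, FB- and RB-AGCN for FR-AGCN.
    """
    if weights is None:
        weights = [1.0] * len(score_list)
    fused = sum(w * s for w, s in zip(weights, score_list))
    return fused.argmax(axis=1)  # predicted class index for every sample
```

Fusing only the FJ and FB scores gives FJB-AGCN, only FJ and RJ gives FRJ-AGCN, and so on.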
Data Preparation
For 'NTU RGB+D':
- Download the raw data from 'NTU-RGB+D' (https://rose1.ntu.edu.sg/dataset/actionRecognition/).

- Then put them under the data directory:

  -data\
    -nturgbd_raw\
      -nturgb+d_skeletons\
        ...
      -samples_with_missing_skeletons.txt

- Preprocess the data with

  python data_gen/ntu_gendata.py

- Generate the forward data with:

  python data_gen/gen_forward_data.py

- Generate the reverse data with:

  python data_gen/gen_reverse_data.py

- Generate the bone data with:

  python data_gen/gen_forward_bone_data.py
  python data_gen/gen_reverse_bone_data.py
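For intuition, below is a simplified sketch of what the reverse and bone generation steps do: the reverse stream flips the sequence along the temporal axis, and a bone is the vector from a joint to its adjacent (parent) joint, following the 2s-AGCN bone definition. The function names and the (N, C, T, V, M) layout are illustrative assumptions, not the exact code in the scripts above.

```python
import numpy as np

def make_reverse(data):
    """Reverse stream: flip the temporal order of each sequence.

    data: array of shape (N, C, T, V, M).
    Note: if sequences are zero-padded to a fixed length, only the valid
    frames of each sample should be flipped in practice.
    """
    return data[:, :, ::-1].copy()

def make_bones(data, pairs):
    """Bone stream: every bone is a joint minus its adjacent (parent) joint.

    pairs: list of (joint, parent) index tuples describing the skeleton graph
           (e.g. 25 joints for the NTU RGB+D skeleton).
    """
    bones = np.zeros_like(data)
    for joint, parent in pairs:
        bones[:, :, :, joint] = data[:, :, :, joint] - data[:, :, :, parent]
    return bones
```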
For 'NTU RGB+D 120':
- Download the raw data from 'NTU-RGB+D 120' (https://rose1.ntu.edu.sg/dataset/actionRecognition/).

- Then put them under the data directory:

  -data\
    -nturgbd_raw\
      -nturgb+d120_skeletons\
        ...
      -NTU_RGBD120_samples_with_missing_skeletons.txt

- Preprocess the data with

  python data_gen/ntu120_gendata_xsub_train.py
  python data_gen/ntu120_gendata_xsub_val.py
  python data_gen/ntu120_gendata_xset_train.py
  python data_gen/ntu120_gendata_xset_val.py

- Generate the forward data with:

  python data_gen/gen_forward_data_ntu120_xsub_train.py
  python data_gen/gen_forward_data_ntu120_xsub_val.py
  python data_gen/gen_forward_data_ntu120_xset_train.py
  python data_gen/gen_forward_data_ntu120_xset_val.py

- Generate the reverse data with:

  python data_gen/gen_reverse_data_ntu120_xsub_train.py
  python data_gen/gen_reverse_data_ntu120_xsub_val.py
  python data_gen/gen_reverse_data_ntu120_xset_train.py
  python data_gen/gen_reverse_data_ntu120_xset_val.py

- Generate the bone data with:

  python data_gen/gen_forward_bone_data_ntu120.py
  python data_gen/gen_reverse_bone_data_ntu120.py
For 'UAV-Human':
- Download the raw data from 'UAV-Human' (https://github.com/SUTDCV/UAV-Human).

- For both benchmarks of UAV-Human, you may have to split the samples into a training set and a testing set according to the subject ID (a hypothetical sketch of this step is given after this list).

- Then put them under the data directory:

  -data\
    -uav\
      -train\   (put the training samples here)
      -test\    (put the testing samples here)

- We ran the two benchmarks in two separate folders, i.e., UAVAGCN and UAVAGCN1.

- It is recommended to use the preprocessing tools provided by the dataset authors (https://github.com/SUTDCV/UAV-Human/tree/master/uavhumanposetools), with the preprocessing file replaced by ours. Note that generate_uav_data.py is based on 2s-AGCN and UAV-Human.

- Pay attention to the settings of the number of joints, the maximum frame number fu, and the down-sampling parameter S.

- First, preprocess the data with

  python data_gen/generate_uav_data.py

- Then put the results under the data directory:

  -data\
    -uav\
      -train_data.npy
      -train_label.pkl
      -test_data.npy
      -test_label.pkl

- Finally, generate the forward, reverse and bone data with

  python data_gen/gen_forward_data.py
  python data_gen/gen_reverse_data.py
  python data_gen/gen_forward_bone_data.py
  python data_gen/gen_reverse_bone_data.py
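A hypothetical sketch of the split-by-ID step mentioned in the list above. It assumes the subject ID can be parsed from the leading 'Pxxx' field of each UAV-Human skeleton file name and that you already have the list of training subject IDs for the chosen benchmark; please check the official UAV-Human split definition before relying on it.

```python
import os
import shutil

def split_by_subject(src_dir, train_ids, train_dir, test_dir):
    """Copy each skeleton file into train/ or test/ according to its subject ID."""
    os.makedirs(train_dir, exist_ok=True)
    os.makedirs(test_dir, exist_ok=True)
    for name in os.listdir(src_dir):
        # Assumed file-name pattern: the subject ID is the number right after the
        # leading 'P', e.g. 'P000S00G10B10H10UC022000LC021000A000R0_08241716.txt'.
        subject_id = int(name[1:4])
        dst = train_dir if subject_id in train_ids else test_dir
        shutil.copy(os.path.join(src_dir, name), os.path.join(dst, name))
```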
Training & Testing
For 'NTU RGB+D':
- X-sub (Cross-Subject):

  python main.py --config ./config/nturgbd-cross-subject/train_forward.yaml
  python main.py --config ./config/nturgbd-cross-subject/train_reverse.yaml
  python main.py --config ./config/nturgbd-cross-subject/train_forward_bone.yaml
  python main.py --config ./config/nturgbd-cross-subject/train_reverse_bone.yaml
  python main.py --config ./config/nturgbd-cross-subject/test_forward.yaml
  python main.py --config ./config/nturgbd-cross-subject/test_reverse.yaml
  python main.py --config ./config/nturgbd-cross-subject/test_forward_bone.yaml
  python main.py --config ./config/nturgbd-cross-subject/test_reverse_bone.yaml

- X-view (Cross-View):

  python main.py --config ./config/nturgbd-cross-view/train_forward.yaml
  python main.py --config ./config/nturgbd-cross-view/train_reverse.yaml
  python main.py --config ./config/nturgbd-cross-view/train_forward_bone.yaml
  python main.py --config ./config/nturgbd-cross-view/train_reverse_bone.yaml
  python main.py --config ./config/nturgbd-cross-view/test_forward.yaml
  python main.py --config ./config/nturgbd-cross-view/test_reverse.yaml
  python main.py --config ./config/nturgbd-cross-view/test_forward_bone.yaml
  python main.py --config ./config/nturgbd-cross-view/test_reverse_bone.yaml

- Finally combine the generated scores with:

  python ensemble_4s.py --datasets ntu/xsub
  python ensemble_4s.py --datasets ntu/xview
For 'NTU RGB+D 120':
- Take the benchmark X-sub (Cross-Subject) as an example:

  python main.py --config ./config/nturgbd120-cross-subject/train_forward.yaml
  python main.py --config ./config/nturgbd120-cross-subject/train_reverse.yaml
  python main.py --config ./config/nturgbd120-cross-subject/train_forward_bone.yaml
  python main.py --config ./config/nturgbd120-cross-subject/train_reverse_bone.yaml
  python main.py --config ./config/nturgbd120-cross-subject/test_forward.yaml
  python main.py --config ./config/nturgbd120-cross-subject/test_reverse.yaml
  python main.py --config ./config/nturgbd120-cross-subject/test_forward_bone.yaml
  python main.py --config ./config/nturgbd120-cross-subject/test_reverse_bone.yaml

- Perform the same operations for the other benchmark, X-set (Cross-Setup):

  python main.py --config ./config/nturgbd120-cross-setup/train_forward.yaml
  python main.py --config ./config/nturgbd120-cross-setup/train_reverse.yaml
  python main.py --config ./config/nturgbd120-cross-setup/train_forward_bone.yaml
  python main.py --config ./config/nturgbd120-cross-setup/train_reverse_bone.yaml
  python main.py --config ./config/nturgbd120-cross-setup/test_forward.yaml
  python main.py --config ./config/nturgbd120-cross-setup/test_reverse.yaml
  python main.py --config ./config/nturgbd120-cross-setup/test_forward_bone.yaml
  python main.py --config ./config/nturgbd120-cross-setup/test_reverse_bone.yaml

- Finally combine the generated scores with:

  python ensemble120_4s.py --datasets ntu/xsub
  python ensemble120_4s.py --datasets ntu/xset
For 'UAV-Human':
- Take the benchmark CSv1 as an example (note that we put the two benchmarks in two different folders):

  python main.py --config ./config/uav/train_forward.yaml
  python main.py --config ./config/uav/train_reverse.yaml
  python main.py --config ./config/uav/train_forward_bone.yaml
  python main.py --config ./config/uav/train_reverse_bone.yaml
  python main.py --config ./config/uav/test_forward.yaml
  python main.py --config ./config/uav/test_reverse.yaml
  python main.py --config ./config/uav/test_forward_bone.yaml
  python main.py --config ./config/uav/test_reverse_bone.yaml

- Finally combine the generated scores with:

  python ensemble_uav_4s.py --datasets uav
Acknowledgements
This work is based on
2s-AGCN (https://github.com/lshiwjx/2s-AGCN)
Thanks to the original authors for their work! Our work mainly improves the data preprocessing part on top of it. Nevertheless, we hope the idea of forward and reverse sequences can be inspiring to others.
Meanwhile, we are very grateful to the creators of these three datasets, i.e., NTU RGB+D 60, NTU RGB+D 120, UAV-Human. Your selfless work has made a great contribution to the computer vision community!
Last but not least, the authors are very grateful for the selfless and constructive suggestions of the reviewers.
Citation
@article{HU2022624,
title = {Forward-reverse adaptive graph convolutional networks for skeleton-based action recognition},
journal = {Neurocomputing},
volume = {492},
pages = {624-636},
year = {2022},
author = {Zesheng Hu and Zihao Pan and Qiang Wang and Lei Yu and Shumin Fei},
}
Contact
If any of the above description is unclear, or if you run into other issues when reproducing the experiments, please open an issue on GitHub. We also look forward to discussions about skeleton-based action recognition.
I am currently a PhD student at Nanjing Normal University. Feel free to contact me via email:
`zeshenghu@njnu.edu.cn`