Awesome-Skeleton-based-Action-Recognition <!-- omit in toc -->
If you have any problems, suggestions, or improvements, please submit an issue or PR.
TODO <!-- omit in toc -->
- Paper list
  - supervised methods
  - semi-supervised methods
  - unsupervised methods
  - adversarial methods
- Leaderboard for supervised methods
  - NTU RGB+D
  - NTU RGB+D 120
- Leaderboard for unsupervised and semi-supervised methods
Contents <!-- omit in toc -->
- Misc
- Datasets
- Semi-supervised and Unsupervised Skeleton Representation
- Supervised Skeleton-based Action Recognition
- LeaderBoard
Misc
- Microsoft Kinect sensor and its effect (IEEE Multimedia 2012) [paper]
- Other GITHUB Repos for Skeleton-based Action Recognition Papers
- Quo Vadis, Skeleton Action Recognition?: A web portal on human action understanding from skeleton data. The portal contains:
  - (1) an interactive dashboard showing detailed performance plots of top-performing models for the NTU-120 dataset.
  - (2) code and pre-trained models for top performers, including a novel ensemble that achieves state-of-the-art performance on NTU-120.
  - (3) new skeleton action datasets (Skeletics-152, Skeleton-Mimetics) and pre-trained models.
Datasets
- (New! 2021) PoseC3D 2D Skeleton Dataset (FineGYM, NTU RGB+D, Kinetics, Volleyball) [arxiv, Github]
- (2017) SYSU 3D Human-Object Interaction Dataset (SYSU)
- (2015) UWA3D Multiview Activity II Dataset (UWA3D) [download]
- (2014) Northwestern-UCLA Dataset (N-UCLA) [download]

This section only shows some popular or new datasets. Other available datasets for 3D action recognition, along with their statistics, can be found in the table from the journal paper of the NTU RGB+D 120 dataset (TPAMI).
Semi-supervised and Unsupervised Skeleton Representation
arXiv
- [Kinetic-GAN] Generative Adversarial Graph Convolutional Networks for Human Action Synthesis (WACV 2022)[arxiv] [Github]
- Augmented skeleton based contrastive action learning with momentum lstm for unsupervised action recognition [arxiv] [Github]
- Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition [arxiv][Github]
- Sparse Semi-Supervised Action Recognition with Active Learning [arxiv]
- 3D Human Action Representation Learning via Cross-View Consistency Pursuit (CVPR 2021)[arxiv][Github]
- STAR: Sparse Transformer-based Action Recognition [arxiv] [Github]
papers
- Skeleton-Contrastive 3D Action Representation Learning (ACM MM 2021) [arxiv] [Github]
- Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition (ECCV 2020) [arxiv]
- Unsupervised 3D Human Pose Representation with Viewpoint and Pose Disentanglement (ECCV 2020) [arxiv] [Github]
- Predict & cluster: Unsupervised skeleton based action recognition (CVPR 2020) [arxiv] [Github]
- MS2L: Multi-task self-supervised learning for skeleton based action recognition (ACMMM 2020) [arxiv]
- Unsupervised feature learning of human actions as trajectories in pose embedding manifold (WACV 2018) [arxiv]
- Unsupervised representation learning with long-term dynamics for skeleton based action recognition (AAAI 2018) [arxiv] [Github]
Skeleton-based Action Recognition under Adversarial Attack
- Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack (CVPR 2021) [arxiv]
- BASAR: Black-box Attack on Skeletal Action Recognition (CVPR 2021) [arxiv]
Supervised Skeleton-based Action Recognition
arXiv papers
This section only includes recent papers posted on arXiv.org since 2018. Note that arXiv papers without available code are not included in the performance leaderboard.
- [Sym-GNN] Symbiotic Graph Neural Networks for 3D Skeleton-based Human Action Recognition and Motion Prediction [arxiv] [Github]
- [DenseIndRNN] Deep Independently Recurrent Neural Network (Preprint) [arxiv] [Github]
- Optimized Skeleton-based Action Recognition via Sparsified Graph Regression [arxiv]
- Skeleton-Based Action Recognition with Synchronous Local and Non-local Spatio-temporal Learning and Frequency Attention [arxiv]
- [DSTA-Net] Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action Recognition [arxiv]
- Skeleton-Based Action Recognition with Multi-Stream Adaptive Graph Convolutional Networks [arxiv]
- Quo Vadis, Skeleton Action Recognition? [arxiv] [Github]
- SynSE: Syntactically Guided Generative Embeddings for Zero Shot Skeleton Action Recognition [arxiv] [Github]
- Leveraging Third-Order Features in Skeleton-Based Action Recognition [arxiv][Github]
Survey
- A Comparative Review of Recent Kinect-based Action Recognition Algorithms (TIP 2019) [arxiv]
2022
- [PoseC3D] Revisiting Skeleton-based Action Recognition [paper] [Code] (CVPR 2022 Oral)
- [PSUMNet] Unified Modality Part Streams are All You Need for Efficient Pose-based Action Recognition (ECCV 2022 WCPA) [paper][Github]
2021
- [MMDGCN] Multi-scale Mixed Dense Graph Convolution Network for Skeleton-based Action Recognition (IEEE Access) [paper]
- Quo Vadis, Skeleton Action Recognition? [paper] [Github] (IJCV)
- Constructing Stronger and Faster Baselines for Skeleton-based Action Recognition [paper] [Github] (submitted to TPAMI)
- [CTR-GCN] Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition [paper] [Github] (ICCV 2021)
2020
- [MV-IGNET] Learning Multi-View Interactional Skeleton Graph for Action Recognition (TPAMI 2020) [paper][Github]
- [P&C FW-AEC] PREDICT & CLUSTER: Unsupervised Skeleton Based Action Recognition (CVPR 2020) [paper]
- [CA-GC] Context Aware Graph Convolution for Skeleton-Based Action Recognition (CVPR 2020) [paper]
- [Shift-GCN] Skeleton-Based Action Recognition With Shift Graph Convolutional Network (CVPR 2020) [paper][Github]
- [DMGNN] Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction (CVPR 2020) [paper]
- [SGN] Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition (CVPR 2020) [arxiv][Github]
- [MS-G3D] Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition (CVPR 2020) [arxiv] [Github]
- [Dynamic GCN] Dynamic GCN: Context-enriched Topology Learning for Skeleton-based Action Recognition (ACM-MM 2020) [arxiv]
- [GCN-NAS] Learning Graph Convolutional Network for Skeleton-based Human Action Recognition by Neural Searching (AAAI 2020) [arxiv] [Github]
- [DecoupleGCN-DropGraph] Decoupling GCN with DropGraph Module for Skeleton-Based Action Recognition (ECCV 2020) [arxiv] [Github]
- [PA-ResGCN] Stronger, Faster and More Explainable: A Graph Convolutional Baseline for Skeleton-based Action Recognition (ACM-MM 2020) [arxiv] [Github]
- [Poincare-GCN] Mix Dimension in Poincaré Geometry for 3D Skeleton-based Action Recognition (ACM-MM 2020) [arxiv]
- [STIGCN] Spatio-Temporal Inception Graph Convolutional for Skeleton-Based Action Recognition (ACM-MM 2020) [arxiv]
- [JOLO-GCN] JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition (WACV 2021) [arxiv]
- [ST-TR-AGCN] Spatial Temporal Transformer Network for Skeleton-based Action Recognition (Under submission at Computer Vision and Image Understanding (CVIU)) [arxiv] [Github]
- [PCRP] Prototypical Contrast and Reverse Prediction: Unsupervised Skeleton Based Action Recognition [arxiv] [Github]
2019
- NTU-RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding (TPAMI 2019) [arxiv] [Homepage] [Github]
- [VA-NN] View Adaptive Neural Networks for High Performance Skeleton-based Human Action Recognition (TPAMI 2019) [arxiv] [Github]
- Bayesian Graph Convolutional LSTM for Skeleton Based Action Recognition (ICCV 2019) [arxiv]
- [2s-SDGCN] Spatial Residual Layer and Dense Connection Block Enhanced Spatial Temporal Graph Convolutional Network for Skeleton-Based Action Recognition (ICCV Workshop 2019) [paper]
- [DGNN] Skeleton-Based Action Recognition With Directed Graph Neural Networks (CVPR 2019) [paper] [unofficial PyTorch implementation]
- [2s-AGCN] Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition (CVPR 2019) [paper] [Github]
- [AS-GCN] Actional-Structural Graph Convolutional Networks for Skeleton-based Action Recognition (CVPR 2019) [arxiv] [Github]
- [AGC-LSTM] An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition (CVPR 2019) [arxiv]
- [Motif-STGCN] Graph CNNs with Motif and Variable Temporal Block for Skeleton-based Action Recognition (AAAI 2019) [arxiv] [Github]
- Richly Activated Graph Convolutional Network for Action Recognition with Incomplete Skeletons (ICIP 2019) [arxiv] [Github]
- [TSRJI] Skeleton Image Representation for 3D Action Recognition based on Tree Structure and Reference Joints (SIBGRAPI) [arxiv] [Github]
- [SkeleMotion] SkeleMotion: A New Representation of Skeleton Joint Sequences Based on Motion Information for 3D Action Recognition (AVSS) [arxiv] [Github]
2018
- Beyond Joints: Learning Representations from Primitive Geometries for Skeleton-based Action Recognition and Detection (TIP 2018) [paper] [Github]
- [DPRL] Deep progressive reinforcement learning for skeleton-based action recognition (CVPR 2018) [paper]
- [SR-TSL] Skeleton based action recognition with spatial reasoning and temporal stack learning (ECCV 2018) [arxiv]
- [HCN] Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation (IJCAI 2018) [arxiv] [Reimplementation]
- [MAN] Memory attention networks for skeleton-based action recognition (IJCAI 2018) [arxiv] [Github]
- [ST-GCN] Spatial temporal graph convolutional networks for skeleton-based action recognition (AAAI 2018) [arxiv] [Github]
- Spatio-temporal graph convolution for skeleton based action recognition (AAAI 2018) [arxiv]
- Part-based Graph Convolutional Network for Action Recognition (BMVC 2018) [arxiv] [Github]
- A Fine-to-Coarse Convolutional Neural Network for 3D Human Action Recognition (BMVC 2018) [arxiv]
- A Large-scale Varying-view RGB-D Action Dataset for Arbitrary-view Human Action Recognition (ACMMM 2018) [arxiv]
2017
- Jointly learning heterogeneous features for RGB-D activity recognition (TPAMI 2017) [paper]
- [Visualization CNN] Enhanced skeleton visualization for view invariant human action recognition (Pattern Recognition 2017) [paper]
- Global context-aware attention lstm networks for 3d action recognition (CVPR 2017) [paper]
- [Two-stream RNN] Modeling temporal dynamics and spatial configurations of actions using two-stream recurrent neural networks (CVPR 2017) [arxiv] [Github]
- [C-CNN + MTLN] A new representation of skeleton sequences for 3d action recognition (CVPR 2017) [arxiv]
- [Ensemble TS-LSTM] Ensemble deep learning for skeleton-based action recognition using temporal sliding lstm networks (ICCV 2017) [paper] [Github]
- [VA-LSTM] View adaptive recurrent neural networks for high performance human action recognition from skeleton data (ICCV 2017) [arxiv]
- Learning action recognition model from depth and skeleton videos (ICCV 2017) [paper]
- [STA-LSTM] An end-to-end spatio-temporal attention model for human action recognition from skeleton data (AAAI 2017) [arxiv]
- Skeleton-based action recognition using LSTM and CNN (ICME Workshop 2017) [arxiv]
- Skeleton-based action recognition with convolutional neural networks (ICME Workshop 2017) [arxiv]
- PKU-MMD: A large scale benchmark for continuous multi-modal human action understanding (ACMMM Workshop 2017) [arxiv]
- [Temporal Conv] Interpretable 3d human action analysis with temporal convolutional networks (CVPR Workshop 2017) [arxiv]
before 2017
- [Trust Gate ST-LSTM] Spatio-temporal lstm with trust gates for 3d human action recognition (ECCV 2016) [arxiv] [Github]
- [Part-aware LSTM] NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis (CVPR 2016) [arxiv]
- Rolling rotations for recognizing human actions from 3d skeletal data (CVPR 2016) [paper]
- Co-occurrence feature learning for skeleton based action recognition using regularized deep lstm networks (AAAI 2016) [paper]
- Skeleton based action recognition with convolutional neural network (ACPR 2015) [paper]
- [H-RNN] Hierarchical recurrent neural network for skeleton based action recognition (CVPR 2015) [paper]
- Jointly learning heterogeneous features for rgb-d activity recognition (CVPR 2015) [paper]
- [LieGroup] Human action recognition by representing 3d skeletons as points in a lie group (CVPR 2014) [paper]
- Human action recognition using a temporal hierarchy of covariance descriptors on 3d joint locations (IJCAI 2013) [paper]
LeaderBoard
This section is continually updated. We only show results on the large-scale datasets NTU-RGB+D and NTU-RGB+D 120.
NTU-RGB+D
Year | Methods | Cross-Subject (%) | Cross-View (%) |
---|---|---|---|
2014 | Lie Group | 50.1 | 52.8 |
2015 | H-RNN | 59.1 | 64.0 |
2016 | Part-aware LSTM | 62.9 | 70.3 |
2016 | Trust Gate ST-LSTM | 69.2 | 77.7 |
2017 | Two-stream RNN | 71.3 | 79.5 |
2017 | STA-LSTM | 73.4 | 81.2 |
2017 | Ensemble TS-LSTM | 74.6 | 81.3 |
2017 | Visualization CNN | 76.0 | 82.6 |
2017 | C-CNN + MTLN | 79.6 | 84.8 |
2017 | Temporal Conv | 74.3 | 83.1 |
2017 | VA-LSTM | 79.4 | 87.6 |
2018 | Beyond Joints | 79.5 | 87.6 |
2018 | ST-GCN | 81.5 | 88.3 |
2018 | DPRL | 83.5 | 89.8 |
2019 | Motif-STGCN | 84.2 | 90.2 |
2018 | HCN | 86.5 | 91.1 |
2018 | SR-TSL | 84.8 | 92.4 |
2018 | MAN | 82.7 | 93.2 |
2019 | RA-GCN | 85.9 | 93.5 |
2019 | DenseIndRNN | 86.7 | 93.7 |
2018 | PB-GCN | 87.5 | 93.2 |
2019 | AS-GCN | 86.8 | 94.2 |
2019 | VA-NN (fusion) | 89.4 | 95.0 |
2019 | AGC-LSTM (Joint&Part) | 89.2 | 95.0 |
2019 | 2s-AGCN | 88.5 | 95.1 |
2020 | SGN | 89.0 | 94.5 |
2020 | GCN-NAS | 89.4 | 95.7 |
2019 | 2s-SDGCN | 89.6 | 95.7 |
2019 | DGNN | 89.9 | 96.1 |
2020 | MV-IGNET | 89.2 | 96.3 |
2020 | 4s Shift-GCN | 90.7 | 96.5 |
2020 | DecoupleGCN-DropGraph | 90.8 | 96.6 |
2020 | PA-ResGCN-B19 | 90.9 | 96.0 |
2020 | MS-G3D | 91.5 | 96.2 |
2021 | EfficientGCN-B4 | 91.7 | 95.7 |
2021 | CTR-GCN | 92.4 | 96.8 |
2022 | PoseC3D | 94.1 | 97.1 |
2022 | PSUMNet | 92.9 | 96.7 |
NTU-RGB+D 120
Most existing methods have not yet been tested on this newer dataset; some results can be found in the paper of the NTU RGB+D 120 dataset (TPAMI).
Year | Methods | Cross-Subject (%) | Cross-Setup (%) |
---|---|---|---|
2019 | SkeleMotion (Magnitude-Orientation) | 62.9 | 63.0 |
2019 | SkeleMotion + Yang et al | 67.7 | 66.9 |
2019 | TSRJI | 67.9 | 59.7 |
2020 | SGN | 79.2 | 81.5 |
2020 | MV-IGNET | 83.9 | 85.6 |
2020 | 4s Shift-GCN | 85.9 | 87.6 |
2020 | DecoupleGCN-DropGraph | 86.5 | 88.1 |
2020 | MS-G3D | 86.9 | 88.4 |
2022 | PoseC3D | 86.9 | 90.3 |
2020 | PA-ResGCN-B19 | 87.3 | 88.3 |
2021 | EfficientGCN-B4 | 88.3 | 89.1 |
2021 | CTR-GCN | 88.9 | 90.6 |
2022 | PSUMNet | 89.4 | 90.6 |